
13 December 2023

Melissa Wen: 15 Tips for Debugging Issues in the AMD Display Kernel Driver

A self-help guide for examining and debugging the AMD display driver within the Linux kernel/DRM subsystem. It's based on my experience as an external developer working on the driver, and is shared with the goal of helping others navigate the driver code. Acknowledgments: These tips were gathered thanks to the countless help received from AMD developers during the driver development process. The list below was obtained by examining open source code, reviewing public documentation, playing with tools, asking in public forums, and also with the help of my former GSoC mentor, Rodrigo Siqueira.

Pre-Debugging Steps: Before diving into an issue, it's crucial to perform two essential steps: 1) Check the latest changes: Ensure you're working with the latest AMD driver modifications located in the amd-staging-drm-next branch maintained by Alex Deucher. You may also find bug fixes for newer kernel versions on branches that follow the name pattern drm-fixes-<date>. 2) Examine the issue tracker: Confirm that your issue isn't already documented and addressed in the AMD display driver issue tracker. If you find a similar issue, you can team up with others and speed up the debugging process.
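A minimal sketch of how to track that branch with git; the remote URL is an assumption on my part (Alex Deucher's tree on the freedesktop.org GitLab), so verify it before relying on it:

# Assumed location of Alex Deucher's tree; double-check the URL.
$ git remote add amd https://gitlab.freedesktop.org/agd5f/linux.git
$ git fetch amd amd-staging-drm-next
# Base your debugging work on the latest AMD display changes.
$ git checkout -b display-debug amd/amd-staging-drm-next
# Fix branches follow the drm-fixes-<date> naming pattern.
$ git branch -r --list 'amd/drm-fixes-*'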

Understanding the issue: Do you really need to change this? Where should you start looking for changes? 3) Is the issue in the AMD kernel driver or in userspace?: Identifying the source of the issue is essential regardless of the GPU vendor. Sometimes this can be challenging, so here are some helpful tips:
  • Record the screen: Capture the screen using a recording app while experiencing the issue. If the bug appears in the capture, it's likely a userspace issue, not the kernel display driver.
  • Analyze the dmesg log: Look for error messages related to the display driver in the dmesg log. If the error message appears before the message [drm] Display Core v..., it's not likely a display driver issue. If this message doesn't appear in your log, the display driver wasn't fully loaded and you will see a notification that something went wrong here.
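A quick sketch of pulling those lines out of dmesg (the exact message wording can vary between kernel versions):

# Show display-related kernel messages.
$ sudo dmesg | grep -iE 'amdgpu|\[drm\]' | less
# The display driver finished loading only if a line like this is present:
#   [drm] Display Core v3.2.xxx initialized on DCN x.y
$ sudo dmesg | grep 'Display Core'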
4) AMD Display Manager vs. AMD Display Core: The AMD display driver consists of two components:
  • Display Manager (DM): This component interacts directly with the Linux DRM infrastructure. Occasionally, issues can arise from misinterpretations of DRM properties or features. If the issue doesn't occur on other platforms with the same AMD hardware - for example, only happens on Linux but not on Windows - it's more likely related to the AMD DM code.
  • Display Core (DC): This is the platform-agnostic part responsible for setting and programming hardware features. Modifications to the DC usually require validation on other platforms, like Windows, to avoid regressions.
5) Identify the DC HW family: Each AMD GPU has variations in its hardware architecture. Features and helpers differ between families, so determining the relevant code for your specific hardware is crucial.
  • Find GPU product information in Linux/AMD GPU documentation
    • Check the dmesg log for the Display Core version (available since this commit in Linux kernel 6.3). For example:
    • [drm] Display Core v3.2.241 initialized on DCN 2.1
    • [drm] Display Core v3.2.237 initialized on DCN 3.0.1

Investigating the relevant driver code: Keep unrelated driver code from affecting your investigation. 6) Narrow the code inspection down to one DC HW family: the relevant code resides in a directory named after the DC number. For example, the DCN 3.0.1 driver code is located at drivers/gpu/drm/amd/display/dc/dcn301. We all know that the AMD shared code is huge, and you can use these boundaries to rule out code unrelated to your issue.
7) Newer families may inherit code from older ones: you can find dcn301 using code from dcn30, dcn20 and dcn10 files. It's crucial to verify which hooks and helpers your driver utilizes to investigate the right portion. You can leverage ftrace for supplemental validation (see the sketch after the capability listing below). To give an example, it was useful when I was updating DCN3 color mapping to correctly use their new post-blending color capabilities, such as: Additionally, you can use two different HW families to compare behaviours. If you see the issue in one but not in the other, you can compare the code and understand what has changed and whether the implementation from a previous family doesn't fit the new HW resources or design well. You can also count on the help of the community on the Linux AMD issue tracker to validate your code on other hardware and/or systems. This approach helped me debug a 2-year-old issue where the cursor gamma adjustment was incorrect on DCN3 hardware, but working correctly for the DCN2 family. I solved the issue in two steps, thanks to community feedback and validation:
8) Check the hardware capability screening in the driver: You can currently find a list of display hardware capabilities in the drivers/gpu/drm/amd/display/dc/dcn*/dcn*_resource.c file. More precisely, in the dcn*_resource_construct() function. Using DCN301 for illustration, here is the list of its hardware caps:
	/*************************************************
	 *  Resource + asic cap harcoding                *
	 *************************************************/
	pool->base.underlay_pipe_index = NO_UNDERLAY_PIPE;
	pool->base.pipe_count = pool->base.res_cap->num_timing_generator;
	pool->base.mpcc_count = pool->base.res_cap->num_timing_generator;
	dc->caps.max_downscale_ratio = 600;
	dc->caps.i2c_speed_in_khz = 100;
	dc->caps.i2c_speed_in_khz_hdcp = 5; /*1.4 w/a enabled by default*/
	dc->caps.max_cursor_size = 256;
	dc->caps.min_horizontal_blanking_period = 80;
	dc->caps.dmdata_alloc_size = 2048;
	dc->caps.max_slave_planes = 2;
	dc->caps.max_slave_yuv_planes = 2;
	dc->caps.max_slave_rgb_planes = 2;
	dc->caps.is_apu = true;
	dc->caps.post_blend_color_processing = true;
	dc->caps.force_dp_tps4_for_cp2520 = true;
	dc->caps.extended_aux_timeout_support = true;
	dc->caps.dmcub_support = true;
	/* Color pipeline capabilities */
	dc->caps.color.dpp.dcn_arch = 1;
	dc->caps.color.dpp.input_lut_shared = 0;
	dc->caps.color.dpp.icsc = 1;
	dc->caps.color.dpp.dgam_ram = 0; // must use gamma_corr
	dc->caps.color.dpp.dgam_rom_caps.srgb = 1;
	dc->caps.color.dpp.dgam_rom_caps.bt2020 = 1;
	dc->caps.color.dpp.dgam_rom_caps.gamma2_2 = 1;
	dc->caps.color.dpp.dgam_rom_caps.pq = 1;
	dc->caps.color.dpp.dgam_rom_caps.hlg = 1;
	dc->caps.color.dpp.post_csc = 1;
	dc->caps.color.dpp.gamma_corr = 1;
	dc->caps.color.dpp.dgam_rom_for_yuv = 0;
	dc->caps.color.dpp.hw_3d_lut = 1;
	dc->caps.color.dpp.ogam_ram = 1;
	// no OGAM ROM on DCN301
	dc->caps.color.dpp.ogam_rom_caps.srgb = 0;
	dc->caps.color.dpp.ogam_rom_caps.bt2020 = 0;
	dc->caps.color.dpp.ogam_rom_caps.gamma2_2 = 0;
	dc->caps.color.dpp.ogam_rom_caps.pq = 0;
	dc->caps.color.dpp.ogam_rom_caps.hlg = 0;
	dc->caps.color.dpp.ocsc = 0;
	dc->caps.color.mpc.gamut_remap = 1;
	dc->caps.color.mpc.num_3dluts = pool->base.res_cap->num_mpc_3dlut; //2
	dc->caps.color.mpc.ogam_ram = 1;
	dc->caps.color.mpc.ogam_rom_caps.srgb = 0;
	dc->caps.color.mpc.ogam_rom_caps.bt2020 = 0;
	dc->caps.color.mpc.ogam_rom_caps.gamma2_2 = 0;
	dc->caps.color.mpc.ogam_rom_caps.pq = 0;
	dc->caps.color.mpc.ogam_rom_caps.hlg = 0;
	dc->caps.color.mpc.ocsc = 1;
	dc->caps.dp_hdmi21_pcon_support = true;
	/* read VBIOS LTTPR caps */
	if (ctx->dc_bios->funcs->get_lttpr_caps) {
		enum bp_result bp_query_result;
		uint8_t is_vbios_lttpr_enable = 0;

		bp_query_result = ctx->dc_bios->funcs->get_lttpr_caps(ctx->dc_bios, &is_vbios_lttpr_enable);
		dc->caps.vbios_lttpr_enable = (bp_query_result == BP_RESULT_OK) && !!is_vbios_lttpr_enable;
	}

	if (ctx->dc_bios->funcs->get_lttpr_interop) {
		enum bp_result bp_query_result;
		uint8_t is_vbios_interop_enabled = 0;

		bp_query_result = ctx->dc_bios->funcs->get_lttpr_interop(ctx->dc_bios, &is_vbios_interop_enabled);
		dc->caps.vbios_lttpr_aware = (bp_query_result == BP_RESULT_OK) && !!is_vbios_interop_enabled;
	}
Keep in mind that the documentation of the color capabilities is available in the Linux kernel documentation.
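Coming back to tip 7: to confirm which family-specific hooks actually run on your hardware, you can filter the kernel function tracer on the per-family symbol prefixes. A rough sketch using the tracefs interface, run as root (the dcn301_*/dcn30_*/dcn20_* prefixes are simply the naming pattern from the directories above; adjust them to your DC HW family):

$ cd /sys/kernel/tracing            # as root, e.g. from sudo -i
$ echo function > current_tracer
# Trace only helpers from the families you suspect are involved.
$ echo 'dcn301_*' > set_ftrace_filter
$ echo 'dcn30_*' >> set_ftrace_filter
$ echo 'dcn20_*' >> set_ftrace_filter
$ echo 1 > tracing_on
# ... reproduce the issue, then stop tracing and check which hooks ran:
$ echo 0 > tracing_on
$ grep -c 'dcn301_' trace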

Understanding the development history: What has brought us to the current state? 9) Pinpoint relevant commits: Use git log and git blame to identify commits targeting the code section you're interested in. 10) Track regressions: If you're examining the amd-staging-drm-next branch, check for regressions between DC release versions. These are defined by DC_VER in the drivers/gpu/drm/amd/display/dc/dc.h file. Alternatively, find a commit with the format drm/amd/display: 3.2.221 that determines a display release; it's useful for bisecting. This information helps you understand how outdated your branch is and identify potential regressions. You can assume each DC_VER bump takes around one week. Finally, check the testing log of each release in the report provided on the amd-gfx mailing list, such as this one tagged Tested-by: Daniel Wheeler.
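A few git invocations that implement the tips above (the dcn301 paths are only an example; point them at your DC HW family):

# Who last touched the code you are investigating?
$ git log --oneline -- drivers/gpu/drm/amd/display/dc/dcn301/
$ git blame drivers/gpu/drm/amd/display/dc/dcn301/dcn301_resource.c

# List DC release bumps; they make convenient bisect boundaries.
$ git log --oneline --grep='drm/amd/display: 3.2.' -- drivers/gpu/drm/amd/display/dc/dc.h

# Check which DC_VER your current branch carries.
$ grep 'DC_VER' drivers/gpu/drm/amd/display/dc/dc.h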

Reducing the inspection area: Focus on what really matters. 11) Identify involved HW blocks: This helps isolate the issue. You can find more information about DCN HW blocks in the DCN Overview documentation. In summary:
  • Plane issues are closer to HUBP and DPP.
  • Blending/Stream issues are closer to MPC, OPP and OPTC. They are related to DRM CRTC subjects.
This information was useful when debugging a hardware rotation issue where the cursor plane got clipped off in the middle of the screen. Finally, the issue was addressed by two patches:
12) Issues around bandwidth (glitches) and clocks: These may be affected by calculations done in these HW blocks and by HW-specific values. The recalculation equations are found in the DML folder. DML stands for Display Mode Library. It's in charge of all required configuration parameters supported by the hardware for multiple scenarios. See more in the AMD DC Overview kernel docs. It's a math library that optimally configures hardware to find the best balance between power efficiency and performance in a given scenario. Finding clk variables that affect device behavior may be a sign that this is the case. It's hard for an external developer to debug this part, since it involves information from HW specs and firmware programming that we don't have access to. The best option is to provide all the relevant debugging information you have and ask AMD developers to check the values you suspect.
  • Do a trick: If you suspect the power setup is degrading performance, try setting the amount of power supplied to the GPU to the maximum and see if it affects the system behavior with this command: sudo bash -c "echo high > /sys/class/drm/card0/device/power_dpm_force_performance_level"
I learned it when debugging glitches with hardware cursor rotation on Steam Deck. My first attempt was changing the clock calculation. In the end, Rodrigo Siqueira proposed the right solution targeting bandwidth in two steps:
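A sketch of the trick above plus a quick way to watch the clock levels amdgpu exposes through sysfs (the card0 index and file names are the common amdgpu ones, but they may differ on your system):

# Force the highest power/clock levels while reproducing the glitch.
$ sudo bash -c "echo high > /sys/class/drm/card0/device/power_dpm_force_performance_level"

# Watch which sclk/mclk levels are selected; the active level is marked with '*'.
$ cat /sys/class/drm/card0/device/pp_dpm_sclk
$ cat /sys/class/drm/card0/device/pp_dpm_mclk

# Restore automatic power management afterwards.
$ sudo bash -c "echo auto > /sys/class/drm/card0/device/power_dpm_force_performance_level"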

Checking implicit programming and hardware limitations: Bring implicit programming to the level of consciousness and recognize hardware limitations. 13) Implicit update types: Check if the selected type for atomic update may affect your issue. The update type depends on the mode settings, since programming some modes demands more time for hardware processing. More details in the source code:
/* Surface update type is used by dc_update_surfaces_and_stream
 * The update type is determined at the very beginning of the function based
 * on parameters passed in and decides how much programming (or updating) is
 * going to be done during the call.
 *
 * UPDATE_TYPE_FAST is used for really fast updates that do not require much
 * logical calculations or hardware register programming. This update MUST be
 * ISR safe on windows. Currently fast update will only be used to flip surface
 * address.
 *
 * UPDATE_TYPE_MED is used for slower updates which require significant hw
 * re-programming however do not affect bandwidth consumption or clock
 * requirements. At present, this is the level at which front end updates
 * that do not require us to run bw_calcs happen. These are in/out transfer func
 * updates, viewport offset changes, recout size changes and pixel depth changes.
 * This update can be done at ISR, but we want to minimize how often this happens.
 *
 * UPDATE_TYPE_FULL is slow. Really slow. This requires us to recalculate our
 * bandwidth and clocks, possibly rearrange some pipes and reprogram anything front
 * end related. Any time viewport dimensions, recout dimensions, scaling ratios or
 * gamma need to be adjusted or pipe needs to be turned on (or disconnected) we do
 * a full update. This cannot be done at ISR level and should be a rare event.
 * Unless someone is stress testing mpo enter/exit, playing with colour or adjusting
 * underscan we don't expect to see this call at all.
 */
enum surface_update_type {
	UPDATE_TYPE_FAST, /* super fast, safe to execute in isr */
	UPDATE_TYPE_MED,  /* ISR safe, most of programming needed, no bw/clk change*/
	UPDATE_TYPE_FULL, /* may need to shuffle resources */
};

Using tools: Observe the current state, validate your findings, continue improvements. 14) Use AMD tools to check hardware state and driver programming: they help you understand your driver settings and check the behavior when changing those settings.
  • DC Visual confirmation: Check multiple planes and pipe split policy.
  • DTN logs: Check display hardware state, including rotation, size, format, underflow, blocks in use, color block values, etc.
  • UMR: Check ASIC info, register values, KMS state - links and elements (framebuffers, planes, CRTCs, connectors). Source: UMR project documentation
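For the visual confirmation and DTN log items above, the amdgpu display driver exposes debugfs entries; a sketch assuming the usual /sys/kernel/debug/dri/0 path (the DRI index varies per system, and debugfs requires root):

# Enable DC visual confirmation (0 disables it; non-zero values select a confirmation mode).
$ sudo bash -c "echo 1 > /sys/kernel/debug/dri/0/amdgpu_dm_visual_confirm"

# Dump the current display hardware state (DTN log).
$ sudo cat /sys/kernel/debug/dri/0/amdgpu_dm_dtn_log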
15) Use generic DRM/KMS tools:
  • IGT test tools: Use generic KMS tests or develop your own to isolate the issue in the kernel space. Compare results across different GPU vendors to understand their implementations and find potential solutions. AMD also has specific IGT tests for its GPUs that are expected to work without failures on any AMD GPU. You can check the results of HW-specific tests using different display hardware families, or you can compare expected differences between the generic workflow and the AMD workflow (see the sketch after this list).
  • drm_info: This tool summarizes the current state of a display driver (capabilities, properties and formats) per element of the DRM/KMS workflow. Output can be helpful when reporting bugs.
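A hedged sketch of both tools in action; IGT test binaries live in different places depending on your distribution or build, and the subtest name is a placeholder:

# Summarize capabilities, properties and formats of the loaded DRM driver.
$ drm_info

# List and run IGT KMS subtests (path and test selection are examples).
$ ./build/tests/kms_cursor_crc --list-subtests
$ sudo ./build/tests/kms_cursor_crc --run-subtest <subtest-name>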

Don't give up! Debugging issues in the AMD display driver can be challenging, but by following these tips and leveraging available resources, you can significantly improve your chances of success. Worth mentioning: This blog post builds upon my talk, "I'm not an AMD expert, but...", presented at the 2022 XDC. It shares guidelines that helped me debug AMD display issues as an external developer of the driver. Open Source Display Driver: The Linux kernel/AMD display driver is open source, allowing you to actively contribute by addressing issues listed in the official tracker. Tackling existing issues or resolving your own can be a rewarding learning experience and a valuable contribution to the community. Additionally, the tracker serves as a valuable resource for finding similar bugs, troubleshooting tips, and suggestions from AMD developers. Finally, it's a platform for seeking help when needed. Remember, contributing to the open source community through issue resolution and collaboration is mutually beneficial for everyone involved.

21 November 2023

Mike Hommey: How I (kind of) killed Mercurial at Mozilla

Did you hear the news? Firefox development is moving from Mercurial to Git. While the decision is far from being mine, and I was barely involved in the small incremental changes that ultimately led to this decision, I feel I have to take at least some responsibility. And if you are one of those who would rather use Mercurial than Git, you may direct all your ire at me. But let's take a step back and review the past 25 years leading to this decision. You'll forgive me for skipping some details and any possible inaccuracies. This is already a long post, while I could have been more thorough, even I think that would have been too much. This is also not an official Mozilla position, only my personal perception and recollection as someone who was involved at times, but mostly an observer from a distance. From CVS to DVCS From its release in 1998, the Mozilla source code was kept in a CVS repository. If you're too young to know what CVS is, let's just say it's an old school version control system, with its set of problems. Back then, it was mostly ubiquitous in the Open Source world, as far as I remember. In the early 2000s, the Subversion version control system gained some traction, solving some of the problems that came with CVS. Incidentally, Subversion was created by Jim Blandy, who now works at Mozilla on completely unrelated matters. In the same period, the Linux kernel development moved from CVS to Bitkeeper, which was more suitable to the distributed nature of the Linux community. BitKeeper had its own problem, though: it was the opposite of Open Source, but for most pragmatic people, it wasn't a real concern because free access was provided. Until it became a problem: someone at OSDL developed an alternative client to BitKeeper, and licenses of BitKeeper were rescinded for OSDL members, including Linus Torvalds (they were even prohibited from purchasing one). Following this fiasco, in April 2005, two weeks from each other, both Git and Mercurial were born. The former was created by Linus Torvalds himself, while the latter was developed by Olivia Mackall, who was a Linux kernel developer back then. And because they both came out of the same community for the same needs, and the same shared experience with BitKeeper, they both were similar distributed version control systems. Interestingly enough, several other DVCSes existed: In this landscape, the major difference Git was making at the time was that it was blazing fast. Almost incredibly so, at least on Linux systems. That was less true on other platforms (especially Windows). It was a game-changer for handling large codebases in a smooth manner. Anyways, two years later, in 2007, Mozilla decided to move its source code not to Bzr, not to Git, not to Subversion (which, yes, was a contender), but to Mercurial. The decision "process" was laid down in two rather colorful blog posts. My memory is a bit fuzzy, but I don't recall that it was a particularly controversial choice. All of those DVCSes were still young, and there was no definite "winner" yet (GitHub hadn't even been founded). It made the most sense for Mozilla back then, mainly because the Git experience on Windows still wasn't there, and that mattered a lot for Mozilla, with its diverse platform support. As a contributor, I didn't think much of it, although to be fair, at the time, I was mostly consuming the source tarballs. 
Personal preferences Digging through my archives, I've unearthed a forgotten chapter: I did end up setting up both a Mercurial and a Git mirror of the Firefox source repository on alioth.debian.org. Alioth.debian.org was a FusionForge-based collaboration system for Debian developers, similar to SourceForge. It was the ancestor of salsa.debian.org. I used those mirrors for the Debian packaging of Firefox (cough cough Iceweasel). The Git mirror was created with hg-fast-export, and the Mercurial mirror was only a necessary step in the process. By that time, I had converted my Subversion repositories to Git, and switched off SVK. Incidentally, I started contributing to Git around that time as well. I apparently did this not too long after Mozilla switched to Mercurial. As a Linux user, I think I just wanted the speed that Mercurial was not providing. Not that Mercurial was that slow, but the difference between a couple seconds and a couple hundred milliseconds was a significant enough difference in user experience for me to prefer Git (and Firefox was not the only thing I was using version control for) Other people had also similarly created their own mirror, or with other tools. But none of them were "compatible": their commit hashes were different. Hg-git, used by the latter, was putting extra information in commit messages that would make the conversion differ, and hg-fast-export would just not be consistent with itself! My mirror is long gone, and those have not been updated in more than a decade. I did end up using Mercurial, when I got commit access to the Firefox source repository in April 2010. I still kept using Git for my Debian activities, but I now was also using Mercurial to push to the Mozilla servers. I joined Mozilla as a contractor a few months after that, and kept using Mercurial for a while, but as a, by then, long time Git user, it never really clicked for me. It turns out, the sentiment was shared by several at Mozilla. Git incursion In the early 2010s, GitHub was becoming ubiquitous, and the Git mindshare was getting large. Multiple projects at Mozilla were already entirely hosted on GitHub. As for the Firefox source code base, Mozilla back then was kind of a Wild West, and engineers being engineers, multiple people had been using Git, with their own inconvenient workflows involving a local Mercurial clone. The most popular set of scripts was moz-git-tools, to incorporate changes in a local Git repository into the local Mercurial copy, to then send to Mozilla servers. In terms of the number of people doing that, though, I don't think it was a lot of people, probably a few handfuls. On my end, I was still keeping up with Mercurial. I think at that time several engineers had their own unofficial Git mirrors on GitHub, and later on Ehsan Akhgari provided another mirror, with a twist: it also contained the full CVS history, which the canonical Mercurial repository didn't have. This was particularly interesting for engineers who needed to do some code archeology and couldn't get past the 2007 cutoff of the Mercurial repository. I think that mirror ultimately became the official-looking, but really unofficial, mozilla-central repository on GitHub. On a side note, a Mercurial repository containing the CVS history was also later set up, but that didn't lead to something officially supported on the Mercurial side. Some time around 2011~2012, I started to more seriously consider using Git for work myself, but wasn't satisfied with the workflows others had set up for themselves. 
I really didn't like the idea of wasting extra disk space keeping a Mercurial clone around while using a Git mirror. I wrote a Python script that would use Mercurial as a library to access a remote repository and produce a git-fast-import stream. That would allow the creation of a git repository without a local Mercurial clone. It worked quite well, but it was not able to incrementally update. Other, more complete tools existed already, some of which I mentioned above. But as time was passing and the size and depth of the Mercurial repository was growing, these tools were showing their limits and were too slow for my taste, especially for the initial clone. Boot to Git In the same time frame, Mozilla ventured in the Mobile OS sphere with Boot to Gecko, later known as Firefox OS. What does that have to do with version control? The needs of third party collaborators in the mobile space led to the creation of what is now the gecko-dev repository on GitHub. As I remember it, it was challenging to create, but once it was there, Git users could just clone it and have a working, up-to-date local copy of the Firefox source code and its history... which they could already have, but this was the first officially supported way of doing so. Coincidentally, Ehsan's unofficial mirror was having trouble (to the point of GitHub closing the repository) and was ultimately shut down in December 2013. You'll often find comments on the interwebs about how GitHub has become unreliable since the Microsoft acquisition. I can't really comment on that, but if you think GitHub is unreliable now, rest assured that it was worse in its beginning. And its sustainability as a platform also wasn't a given, being a rather new player. So on top of having this official mirror on GitHub, Mozilla also ventured in setting up its own Git server for greater control and reliability. But the canonical repository was still the Mercurial one, and while Git users now had a supported mirror to pull from, they still had to somehow interact with Mercurial repositories, most notably for the Try server. Git slowly creeping in Firefox build tooling Still in the same time frame, tooling around building Firefox was improving drastically. For obvious reasons, when version control integration was needed in the tooling, Mercurial support was always a no-brainer. The first explicit acknowledgement of a Git repository for the Firefox source code, other than the addition of the .gitignore file, was bug 774109. It added a script to install the prerequisites to build Firefox on macOS (still called OSX back then), and that would print a message inviting people to obtain a copy of the source code with either Mercurial or Git. That was a precursor to current bootstrap.py, from September 2012. Following that, as far as I can tell, the first real incursion of Git in the Firefox source tree tooling happened in bug 965120. A few days earlier, bug 952379 had added a mach clang-format command that would apply clang-format-diff to the output from hg diff. Obviously, running hg diff on a Git working tree didn't work, and bug 965120 was filed, and support for Git was added there. That was in January 2014. A year later, when the initial implementation of mach artifact was added (which ultimately led to artifact builds), Git users were an immediate thought. But while they were considered, it was not to support them, but to avoid actively breaking their workflows. Git support for mach artifact was eventually added 14 months later, in March 2016. 
From gecko-dev to git-cinnabar Let's step back a little here, back to the end of 2014. My user experience with Mercurial had reached a level of dissatisfaction that was enough for me to decide to take that script from a couple years prior and make it work for incremental updates. That meant finding a way to store enough information locally to be able to reconstruct whatever the incremental updates would be relying on (guess why other tools hid a local Mercurial clone under hood). I got something working rather quickly, and after talking to a few people about this side project at the Mozilla Portland All Hands and seeing their excitement, I published a git-remote-hg initial prototype on the last day of the All Hands. Within weeks, the prototype gained the ability to directly push to Mercurial repositories, and a couple months later, was renamed to git-cinnabar. At that point, as a Git user, instead of cloning the gecko-dev repository from GitHub and switching to a local Mercurial repository whenever you needed to push to a Mercurial repository (i.e. the aforementioned Try server, or, at the time, for reviews), you could just clone and push directly from/to Mercurial, all within Git. And it was fast too. You could get a full clone of mozilla-central in less than half an hour, when at the time, other similar tools would take more than 10 hours (needless to say, it's even worse now). Another couple months later (we're now at the end of April 2015), git-cinnabar became able to start off a local clone of the gecko-dev repository, rather than clone from scratch, which could be time consuming. But because git-cinnabar and the tool that was updating gecko-dev weren't producing the same commits, this setup was cumbersome and not really recommended. For instance, if you pushed something to mozilla-central with git-cinnabar from a gecko-dev clone, it would come back with a different commit hash in gecko-dev, and you'd have to deal with the divergence. Eventually, in April 2020, the scripts updating gecko-dev were switched to git-cinnabar, making the use of gecko-dev alongside git-cinnabar a more viable option. Ironically(?), the switch occurred to ease collaboration with KaiOS (you know, the mobile OS born from the ashes of Firefox OS). Well, okay, in all honesty, when the need of syncing in both directions between Git and Mercurial (we only had ever synced from Mercurial to Git) came up, I nudged Mozilla in the direction of git-cinnabar, which, in my (biased but still honest) opinion, was the more reliable option for two-way synchronization (we did have regular conversion problems with hg-git, nothing of the sort has happened since the switch). One Firefox repository to rule them all For reasons I don't know, Mozilla decided to use separate Mercurial repositories as "branches". With the switch to the rapid release process in 2011, that meant one repository for nightly (mozilla-central), one for aurora, one for beta, and one for release. And with the addition of Extended Support Releases in 2012, we now add a new ESR repository every year. Boot to Gecko also had its own branches, and so did Fennec (Firefox for Mobile, before Android). There are a lot of them. And then there are also integration branches, where developer's work lands before being merged in mozilla-central (or backed out if it breaks things), always leaving mozilla-central in a (hopefully) good state. Only one of them remains in use today, though. I can only suppose that the way Mercurial branches work was not deemed practical. 
It is worth noting, though, that Mercurial branches are used in some cases, to branch off a dot-release when the next major release process has already started, so it's not a matter of not knowing the feature exists or some such. In 2016, Gregory Szorc set up a new repository that would contain them all (or at least most of them), which eventually became what is now the mozilla-unified repository. This would e.g. simplify switching between branches when necessary. 7 years later, for some reason, the other "branches" still exist, but most developers are expected to be using mozilla-unified. Mozilla's CI also switched to using mozilla-unified as base repository. Honestly, I'm not sure why the separate repositories are still the main entry point for pushes, rather than going directly to mozilla-unified, but it probably comes down to switching being work, and not being a top priority. Also, it probably doesn't help that working with multiple heads in Mercurial, even (especially?) with bookmarks, can be a source of confusion. To give an example, if you aren't careful, and do a plain clone of the mozilla-unified repository, you may not end up on the latest mozilla-central changeset, but rather, e.g. one from beta, or some other branch, depending which one was last updated. Hosting is simple, right? Put your repository on a server, install hgweb or gitweb, and that's it? Maybe that works for... Mercurial itself, but that repository "only" has slightly over 50k changesets and less than 4k files. Mozilla-central has more than an order of magnitude more changesets (close to 700k) and two orders of magnitude more files (more than 700k if you count the deleted or moved files, 350k if you count the currently existing ones). And remember, there are a lot of "duplicates" of this repository. And I didn't even mention user repositories and project branches. Sure, it's a self-inflicted pain, and you'd think it could probably(?) be mitigated with shared repositories. But consider the simple case of two repositories: mozilla-central and autoland. You make autoland use mozilla-central as a shared repository. Now, you push something new to autoland, it's stored in the autoland datastore. Eventually, you merge to mozilla-central. Congratulations, it's now in both datastores, and you'd need to clean-up autoland if you wanted to avoid the duplication. Now, you'd think mozilla-unified would solve these issues, and it would... to some extent. Because that wouldn't cover user repositories and project branches briefly mentioned above, which in GitHub parlance would be considered as Forks. So you'd want a mega global datastore shared by all repositories, and repositories would need to only expose what they really contain. Does Mercurial support that? I don't think so (okay, I'll give you that: even if it doesn't, it could, but that's extra work). And since we're talking about a transition to Git, does Git support that? You may have read about how you can link to a commit from a fork and make-pretend that it comes from the main repository on GitHub? At least, it shows a warning, now. That's essentially the architectural reason why. So the actual answer is that Git doesn't support it out of the box, but GitHub has some backend magic to handle it somehow (and hopefully, other things like Gitea, Girocco, Gitlab, etc. have something similar). Now, to come back to the size of the repository. A repository is not a static file. It's a server with which you negotiate what you have against what it has that you want. 
Then the server bundles what you asked for based on what you said you have. Or in the opposite direction, you negotiate what you have that it doesn't, you send it, and the server incorporates what you sent it. Fortunately the latter is less frequent and requires authentication. But the former is more frequent and CPU intensive. Especially when pulling a large number of changesets, which, incidentally, cloning is. "But there is a solution for clones" you might say, which is true. That's clonebundles, which offload the CPU intensive part of cloning to a single job scheduled regularly. Guess who implemented it? Mozilla. But that only covers the cloning part. We actually had laid the ground to support offloading large incremental updates and split clones, but that never materialized. Even with all that, that still leaves you with a server that can display file contents, diffs, blames, provide zip archives of a revision, and more, all of which are CPU intensive in their own way. And these endpoints are regularly abused, and cause extra load to your servers, yes plural, because of course a single server won't handle the load for the number of users of your big repositories. And because your endpoints are abused, you have to close some of them. And I'm not mentioning the Try repository with its tens of thousands of heads, which brings its own sets of problems (and it would have even more heads if we didn't fake-merge them once in a while). Of course, all the above applies to Git (and it only gained support for something akin to clonebundles last year). So, when the Firefox OS project was stopped, there wasn't much motivation to continue supporting our own Git server, Mercurial still being the official point of entry, and git.mozilla.org was shut down in 2016. The growing difficulty of maintaining the status quo Slowly, but steadily in more recent years, as new tooling was added that needed some input from the source code manager, support for Git was more and more consistently added. But at the same time, as people left for other endeavors and weren't necessarily replaced, or more recently with layoffs, resources allocated to such tooling have been spread thin. Meanwhile, the repository growth didn't take a break, and the Try repository was becoming an increasing pain, with push times quite often exceeding 10 minutes. The ongoing work to move Try pushes to Lando will hide the problem under the rug, but the underlying problem will still exist (although the last version of Mercurial seems to have improved things). On the flip side, more and more people have been relying on Git for Firefox development, to my own surprise, as I didn't really push for that to happen. It just happened organically, by ways of git-cinnabar existing, providing a compelling experience to those who prefer Git, and, I guess, word of mouth. I was genuinely surprised when I recently heard the use of Git among moz-phab users had surpassed a third. I did, however, occasionally orient people who struggled with Mercurial and said they were more familiar with Git, towards git-cinnabar. I suspect there's a somewhat large number of people who never realized Git was a viable option. But that, on its own, can come with its own challenges: if you use git-cinnabar without being backed by gecko-dev, you'll have a hard time sharing your branches on GitHub, because you can't push to a fork of gecko-dev without pushing your entire local repository, as they have different commit histories. 
And switching to gecko-dev when you weren't already using it requires some extra work to rebase all your local branches from the old commit history to the new one. Clone times with git-cinnabar have also started to go a little out of hand in the past few years, but this was mitigated in a similar manner as with the Mercurial cloning problem: with static files that are refreshed regularly. Ironically, that made cloning with git-cinnabar faster than cloning with Mercurial. But generating those static files is increasingly time-consuming. As of writing, generating those for mozilla-unified takes close to 7 hours. I was predicting clone times over 10 hours "in 5 years" in a post from 4 years ago, I wasn't too far off. With exponential growth, it could still happen, although to be fair, CPUs have improved since. I will explore the performance aspect in a subsequent blog post, alongside the upcoming release of git-cinnabar 0.7.0-b1. I don't even want to check how long it now takes with hg-git or git-remote-hg (they were already taking more than a day when git-cinnabar was taking a couple hours). I suppose it's about time that I clarify that git-cinnabar has always been a side-project. It hasn't been part of my duties at Mozilla, and the extent to which Mozilla supports git-cinnabar is in the form of taskcluster workers on the community instance for both git-cinnabar CI and generating those clone bundles. Consequently, that makes the above git-cinnabar specific issues a Me problem, rather than a Mozilla problem. Taking the leap I can't talk for the people who made the proposal to move to Git, nor for the people who put a green light on it. But I can at least give my perspective. Developers have regularly asked why Mozilla was still using Mercurial, but I think it was the first time that a formal proposal was laid out. And it came from the Engineering Workflow team, responsible for issue tracking, code reviews, source control, build and more. It's easy to say "Mozilla should have chosen Git in the first place", but back in 2007, GitHub wasn't there, Bitbucket wasn't there, and all the available options were rather new (especially compared to the then 21 years-old CVS). I think Mozilla made the right choice, all things considered. Had they waited a couple years, the story might have been different. You might say that Mozilla stayed with Mercurial for so long because of the sunk cost fallacy. I don't think that's true either. But after the biggest Mercurial repository hosting service turned off Mercurial support, and the main contributor to Mercurial going their own way, it's hard to ignore that the landscape has evolved. And the problems that we regularly encounter with the Mercurial servers are not going to get any better as the repository continues to grow. As far as I know, all the Mercurial repositories bigger than Mozilla's are... not using Mercurial. Google has its own closed-source server, and Facebook has another of its own, and it's not really public either. With resources spread thin, I don't expect Mozilla to be able to continue supporting a Mercurial server indefinitely (although I guess Octobus could be contracted to give a hand, but is that sustainable?). Mozilla, being a champion of Open Source, also doesn't live in a silo. At some point, you have to meet your contributors where they are. And the Open Source world is now majoritarily using Git. 
I'm sure the vast majority of new hires at Mozilla in the past, say, 5 years, know Git and have had to learn Mercurial (although they arguably didn't need to). Even within Mozilla, with thousands(!) of repositories on GitHub, Firefox is now actually the exception rather than the norm. I should even actually say Desktop Firefox, because even Mobile Firefox lives on GitHub (although Fenix is moving back in together with Desktop Firefox, and the timing is such that that will probably happen before Firefox moves to Git). Heck, even Microsoft moved to Git! With a significant developer base already using Git thanks to git-cinnabar, and all the constraints and problems I mentioned previously, it actually seems natural that a transition (finally) happens. However, had git-cinnabar or something similarly viable not existed, I don't think Mozilla would be in a position to take this decision. On one hand, it probably wouldn't be in the current situation of having to support both Git and Mercurial in the tooling around Firefox, nor the resource constraints related to that. But on the other hand, it would be farther from supporting Git and being able to make the switch in order to address all the other problems. But... GitHub? I hope I made a compelling case that hosting is not as simple as it can seem, at the scale of the Firefox repository. It's also not Mozilla's main focus. Mozilla has enough on its plate with the migration of existing infrastructure that does rely on Mercurial to understandably not want to figure out the hosting part, especially with limited resources, and with the mixed experience hosting both Mercurial and git has been so far. After all, GitHub couldn't even display things like the contributors' graph on gecko-dev until recently, and hosting is literally their job! They still drop the ball on large blames (thankfully we have searchfox for those). Where does that leave us? Gitlab? For those criticizing GitHub for being proprietary, that's probably not open enough. Cloud Source Repositories? "But GitHub is Microsoft" is a complaint I've read a lot after the announcement. Do you think Google hosting would have appealed to these people? Bitbucket? I'm kind of surprised it wasn't in the list of providers that were considered, but I'm also kind of glad it wasn't (and I'll leave it at that). I think the only relatively big hosting provider that could have made the people criticizing the choice of GitHub happy is Codeberg, but I hadn't even heard of it before it was mentioned in response to Mozilla's announcement. But really, with literal thousands of Mozilla repositories already on GitHub, with literal tens of millions repositories on the platform overall, the pragmatic in me can't deny that it's an attractive option (and I can't stress enough that I wasn't remotely close to the room where the discussion about what choice to make happened). "But it's a slippery slope". I can see that being a real concern. LLVM also moved its repository to GitHub (from a (I think) self-hosted Subversion server), and ended up moving off Bugzilla and Phabricator to GitHub issues and PRs four years later. As an occasional contributor to LLVM, I hate this move. I hate the GitHub review UI with a passion. At least, right now, GitHub PRs are not a viable option for Mozilla, for their lack of support for security related PRs, and the more general shortcomings in the review UI. That doesn't mean things won't change in the future, but let's not get too far ahead of ourselves. 
The move to Git has just been announced, and the migration has not even begun yet. Just because Mozilla is moving the Firefox repository to GitHub doesn't mean it's locked in forever or that all the eggs are going to be thrown into one basket. If bridges need to be crossed in the future, we'll see then. So, what's next? The official announcement said we're not expecting the migration to really begin until six months from now. I'll swim against the current here, and say this: the earlier you can switch to git, the earlier you'll find out what works and what doesn't work for you, whether you already know Git or not. While there is not one unique workflow, here's what I would recommend anyone who wants to take the leap off Mercurial right now: As there is no one-size-fits-all workflow, I won't tell you how to organize yourself from there. I'll just say this: if you know the Mercurial sha1s of your previous local work, you can create branches for them with:
$ git branch <branch_name> $(git cinnabar hg2git <hg_sha1>)
At this point, you should have everything available on the Git side, and you can remove the .hg directory. Or move it into some empty directory somewhere else, just in case. But don't leave it here, it will only confuse the tooling. Artifact builds WILL be confused, though, and you'll have to ./mach configure before being able to do anything. You may also hit bug 1865299 if your working tree is older than this post. If you have any problem or question, you can ping me on #git-cinnabar or #git on Matrix. I'll put the instructions above somewhere on wiki.mozilla.org, and we can collaboratively iterate on them. Now, what the announcement didn't say is that the Git repository WILL NOT be gecko-dev, doesn't exist yet, and WON'T BE COMPATIBLE (trust me, it'll be for the better). Why did I make you do all the above, you ask? Because that won't be a problem. I'll have you covered, I promise. The upcoming release of git-cinnabar 0.7.0-b1 will have a way to smoothly switch between gecko-dev and the future repository (incidentally, that will also allow to switch from a pure git-cinnabar clone to a gecko-dev one, for the git-cinnabar users who have kept reading this far). What about git-cinnabar? With Mercurial going the way of the dodo at Mozilla, my own need for git-cinnabar will vanish. Legitimately, this begs the question whether it will still be maintained. I can't answer for sure. I don't have a crystal ball. However, the needs of the transition itself will motivate me to finish some long-standing things (like finalizing the support for pushing merges, which is currently behind an experimental flag) or implement some missing features (support for creating Mercurial branches). Git-cinnabar started as a Python script, it grew a sidekick implemented in C, which then incorporated some Rust, which then cannibalized the Python script and took its place. It is now close to 90% Rust, and 10% C (if you don't count the code from Git that is statically linked to it), and has sort of become my Rust playground (it's also, I must admit, a mess, because of its history, but it's getting better). So the day to day use with Mercurial is not my sole motivation to keep developing it. If it were, it would stay stagnant, because all the features I need are there, and the speed is not all that bad, although I know it could be better. Arguably, though, git-cinnabar has been relatively stagnant feature-wise, because all the features I need are there. So, no, I don't expect git-cinnabar to die along Mercurial use at Mozilla, but I can't really promise anything either. Final words That was a long post. But there was a lot of ground to cover. And I still skipped over a bunch of things. I hope I didn't bore you to death. If I did and you're still reading... what's wrong with you? ;) So this is the end of Mercurial at Mozilla. So long, and thanks for all the fish. But this is also the beginning of a transition that is not easy, and that will not be without hiccups, I'm sure. So fasten your seatbelts (plural), and welcome the change. To circle back to the clickbait title, did I really kill Mercurial at Mozilla? Of course not. But it's like I stumbled upon a few sparks and tossed a can of gasoline on them. I didn't start the fire, but I sure made it into a proper bonfire... and now it has turned into a wildfire. And who knows? 15 years from now, someone else might be looking back at how Mozilla picked Git at the wrong time, and that, had we waited a little longer, we would have picked some yet to come new horse. 
But hey, that's the tech cycle for you.

20 November 2023

Russ Allbery: Review: The Exiled Fleet

Review: The Exiled Fleet, by J.S. Dewes
Series: Divide #2
Publisher: Tor
Copyright: 2021
ISBN: 1-250-23635-5
Format: Kindle
Pages: 421
The Exiled Fleet is far-future interstellar military SF. It is a direct sequel to The Last Watch. You don't want to start here. The Last Watch took a while to get going, but it ended with some fascinating world-building and a suitably enormous threat. I was hoping Dewes would carry that momentum into the second book. I was disappointed; instead, The Exiled Fleet starts with interpersonal angst and wallowing and takes an annoyingly long time to build up narrative tension again. The world-building of the first book looked outward, towards aliens and strange technology and stranger physics, while setting up contributing problems on the home front. The Exiled Fleet pivots inwards, both in terms of world-building and in terms of character introspection. Neither of those worked as well for me. There's nothing wrong with the revelations here about human power structures and the politics that the Sentinels have been missing at the edge of space, but it also felt like a classic human autocracy without much new to offer in either wee thinky bits or plot structure. We knew most of shape from the start of the first book: Cavalon's grandfather is evil, human society is run as an oligarchy, and everything is trending authoritarian. Once the action started, I was entertained but not gripped the way that I was when reading The Last Watch. Dewes makes a brief attempt to tap into the morally complex question of the military serving as a brake on tyranny, but then does very little with it. Instead, everything is excessively personal, turning the political into less of a confrontation of ideologies or ethics and more a story of family abuse and rebellion. There is even more psychodrama in this book than there was in the previous book. I found it exhausting. Rake is barely functional after the events of the previous book and pushing herself way too hard at the start of this one. Cavalon regresses considerably and starts falling apart again. There's a lot of moping, a lot of angst, and a lot of characters berating themselves and occasionally each other. It was annoying enough that I took a couple of weeks break from this book in the middle before I could work up the enthusiasm to finish it. Some of this is personal preference. My favorite type of story is competence porn: details about something esoteric and satisfyingly complex, a challenge to overcome, and a main character who deploys their expertise to overcome that challenge in a way that shows they generally have their shit together. I can enjoy other types of stories, but that's the story I'll keep reaching for. Other people prefer stories about fuck-ups and walking disasters, people who barely pull together enough to survive the plot (or sometimes not even that). There's nothing wrong with that, and neither approach is right or wrong, but my tolerance for that story is usually lot lower. I think Dewes is heading towards the type of story in which dysfunctional characters compensate for each other's flaws in order to keep each other going, and intellectually I can see the appeal. But it's not my thing, and when the main characters are falling apart and the supporting characters project considerably more competence, I wish the story had different protagonists. It didn't help that this is in theory military SF, but Dewes does not seem to want to deploy any of the support framework of the military to address any of her characters' problems. 
This book is a lot of Rake and Cavalon dragging each other through emotional turmoil while coming to terms with Cavalon's family. I liked their dynamic in the first book when it felt more like Rake showing leadership skills. Here, it turns into something closer to found family in ways that seemed wildly inconsistent with the military structure, and while I'm normally not one to defend hierarchical discipline, I felt like Rake threw out the only structure she had to handle the thousands of other people under her command and started winging it based on personal friendship. If this were a small commercial crew, sure, fine, but Rake has a personal command responsibility that she obsessively angsts about and yet keeps abandoning. I realize this is probably another way to complain that I wanted competence porn and got barely-functional fuck-ups. The best parts of this series are the strange technologies and the aliens, and they are again the best part of this book. There was a truly great moment involving Viator technology that I found utterly delightful, and there was an intriguing setup for future books that caught my attention. Unfortunately, there were also a lot of deus ex machina solutions to problems, both from convenient undisclosed character backstories and from alien tech. I felt like the characters had to work satisfyingly hard for their victories in the first book; here, I felt like Dewes kept having issues with her characters being at point A and her plot at point B and pulling some rabbit out of the hat to make the plot work. This unfortunately undermined the cool factor of the world-building by making its plot device aspects a bit too obvious. This series also turns out not to be a duology (I have no idea why I thought it would be). By the end of The Exiled Fleet, none of the major political or world-building problems have been resolved. At best, the characters are in a more stable space to start being proactive. I'm cautiously optimistic that could mean the series would turn into the type of story I was hoping for, but I'm worried that Dewes is interested in writing a different type of character story than I am interested in reading. Hopefully there will be some clues in the synopsis of the (as yet unannounced) third book. I thought The Last Watch had some first-novel problems but was worth reading. I am much more reluctant to recommend The Exiled Fleet, or the series as a whole given that it is incomplete. Unless you like dysfunctional characters, proceed with caution. Rating: 5 out of 10

11 November 2023

Matthias Klumpp: AppStream 1.0 released!

Today, 12 years after the meeting where AppStream was first discussed and 11 years after I released a prototype implementation, I am excited to announce AppStream 1.0! Check it out on GitHub, get the release tarball, or read the documentation or release notes!

Some nostalgic memories
I was not in the original AppStream meeting, since in 2011 I was extremely busy with finals preparations and ball organization in high school, but I still vividly remember sitting at school in the students' lounge during a break and trying to catch the really choppy live stream from the meeting on my borrowed laptop (a futile exercise; I watched parts of the blurry recording later). I was extremely passionate about getting software deployment to work better on Linux and to improve the overall user experience, and spent many hours on the PackageKit IRC channel discussing things with many amazing people like Richard Hughes, Daniel Nicoletti, Sebastian Heinlein and others. At the time I was writing a software deployment tool called Listaller (this was before Linux containers were a thing), and building it was very tough due to technical and personal limitations (I had just learned C!). Then in university, when I intended to recreate this tool, but for real and better this time as a new project called Limba, I needed a way to provide metadata for it, and AppStream fit right in! Meanwhile, Richard Hughes was tackling the UI side of things while creating GNOME Software and needed a solution as well. So I implemented a prototype and together we pretty much reshaped the early specification from the original meeting into what would become modern AppStream. Back then I saw AppStream as a necessary side-project for my actual project, and didn't even consider myself the maintainer of it for quite a while (I hadn't been at the meeting after all). All those years ago I had no idea that ultimately I was developing AppStream not for Limba, but for a new thing that would show up later, with an even more modern design called Flatpak. I also had no idea how incredibly complex AppStream would become and how many features it would have and how much more maintenance work it would be, nor how ubiquitous it would become. The modern Linux desktop uses AppStream everywhere now: it is supported by all major distributions, used by Flatpak for metadata, used for firmware metadata via Richard's fwupd/LVFS, runs on every Steam Deck, can be found in cars and possibly many places I do not know yet.

What is new in 1.0?

API breaks The most important thing that's new with the 1.0 release is a bunch of incompatible changes. For the shared libraries, all deprecated API elements have been removed and a bunch of other changes have been made to improve the overall API and especially make it more binding-friendly. That doesn't mean that the API is completely new and nothing looks like before though: when possible the previous API design was kept, and some changes that would have been too disruptive have not been made. Regardless of that, you will have to port your AppStream-using applications. For some larger ones I already submitted patches to build with both AppStream versions, the 0.16.x stable series as well as 1.0+. For the XML specification, some older compatibility for XML that had no or very few users has been removed as well. This affects for example release elements that reference downloadable data without an artifact block, which has not been supported for a while. For all of these, I checked to remove only things that had close to no users and that were a significant maintenance burden. So as a rule of thumb: if your XML validated with no warnings with the 0.16.x branch of AppStream, it will still be 100% valid with the 1.0 release. Another notable change is that the generated output of AppStream 1.0 will always be 1.0 compliant, i.e. you cannot make it generate data for versions below that (this greatly reduced the maintenance cost of the project).

Developer element For a long time, you could set the developer name using the top-level developer_name tag. With AppStream 1.0, this is changed a bit. There is now a developer tag with a name child (that can be translated unless the translate="no" attribute is set on it). This allows future extensibility, and also allows setting a machine-readable id attribute in the developer element. This permits software centers to group software by developer more easily, without having to use heuristics. If we decide to extend the developer information per-app in future, this is also now possible. Do not worry though: the developer_name tag is still read, so there is no high pressure to update. The old 0.16.x stable series also has this feature backported, so it can be available everywhere. Check out the developer tag specification for more details.
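To make that more concrete, a metainfo snippet using the new element could look roughly like this; the id and name values are made-up examples, not taken from the release notes:
<developer id="org.example.devteam">
  <name translate="no">Example Dev Team</name>
</developer>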

Scale factor for screenshots Screenshot images can now have a scale attribute, to indicate an (integer) scaling factor to apply. This feature was a breaking change and therefore we could not have it for the longest time, but it is now available. Please wait a bit for AppStream 1.0 to become more widely deployed though, as using it with older AppStream versions may lead to issues in some cases. Check out the screenshots tag specification for more details.
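As a rough sketch, with a placeholder URL, a screenshot shipped as a 2x capture could be declared like this (the scale attribute sits on the image, per the description above):
<screenshot type="default">
  <image type="source" scale="2">https://example.org/screenshots/main-window-2x.png</image>
</screenshot>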

Screenshot environments It is now possible to indicate the environment a screenshot was recorded in (GNOME, GNOME Dark, KDE Plasma, Windows, etc.) via an environment attribute on the respective screenshot tag. This was also a breaking change, so use it carefully for now! If projects want to, they can use this feature to supply dedicated screenshots depending on the environment the application page is displayed in. Check out the screenshots tag specification for more details.
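For example, a KDE Plasma screenshot could hypothetically be marked like this; the exact environment identifier strings are an assumption here and should be checked against the specification:
<screenshot type="default" environment="plasma">
  <image type="source">https://example.org/screenshots/main-window-plasma.png</image>
</screenshot>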

References tag This is a feature more important for the scientific community and scientific applications. Using the references tag, you can associate the AppStream component with a DOI (Digital Object Identifier) or provide a link to a CFF file to provide citation information. It also allows linking to other scientific registries. Check out the references tag specification for more details.
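A hypothetical sketch of what that could look like follows; the DOI is a placeholder and the exact element and type names should be double-checked against the specification:
<references>
  <reference type="doi">10.1000/example.doi</reference>
</references>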

Release tags Releases can have tags now, just like components. This is generally not a feature that I expect to be used much, but in certain instances it can become useful with a cooperating software center, for example to tag certain releases as long-term supported versions.

Multi-platform support Thanks to the interest and work of many volunteers, AppStream (mostly) runs on FreeBSD now, a NetBSD port exists, support for macOS was written and a Windows port is on its way! Thank you to everyone working on this.

Better compatibility checks For a long time I thought that the AppStream library should just be a thin layer above the XML and that software centers should implement a lot of the actual logic themselves. This has not been the case for a while, but there were still a lot of complex AppStream features that were hard for software centers to implement and where it makes sense to have one implementation that projects can just use. The validation of component relations is one such thing. This was implemented in 0.16.x as well, but 1.0 vastly improves upon the compatibility checks, so you can now just run as_component_check_relations and retrieve a detailed list of whether the current component will run well on the system. Besides better API for software developers, the appstreamcli utility also has much improved support for relation checks, and I wrote about these changes in a previous post. Check it out! With these changes, I hope this feature will be used much more, and beyond just drivers and firmware.

So much more! The changelog for the 1.0 release is huge, and there are many papercuts resolved and changes made that I did not talk about here, like us using gi-docgen (instead of gtkdoc) now for nice API documentation, or the many improvements that went into better binding support, or better search, or just plain bugfixes.

Outlook I expect the transition to 1.0 to take a bit of time. AppStream has not broken its API for many, many years (since 2016), so a bunch of places need to be touched even if the changes themselves are minor in many cases. In hindsight, I should have released 1.0 much sooner and it should not have become such a mega-release, but that was mainly due to time constraints. So, what's in it for the future? Contrary to what I thought, AppStream does not really seem to be done and feature-complete at any point; there is always something to improve, and people come up with new use cases all the time. So, expect more of the same in future: bugfixes, validator improvements, documentation improvements, better tools and the occasional new feature. Onwards to 1.0.1!

17 October 2023

Russ Allbery: Review: A Hat Full of Sky

Review: A Hat Full of Sky, by Terry Pratchett
Series: Discworld #32
Publisher: HarperTrophy
Copyright: 2004
Printing: 2005
ISBN: 0-06-058662-1
Format: Mass market
Pages: 407
A Hat Full of Sky is the 32nd Discworld novel and the second Tiffany Aching young adult novel. You should not start here, but you could start with The Wee Free Men. As with that book, some parts of the story carry more weight if you are already familiar with Granny Weatherwax. Tiffany is a witch, but she needs to be trained. This is normally done by apprenticeship, and in Tiffany's case it seemed wise to give her exposure to more types of witching. Thus, Tiffany, complete with new boots and a going-away present from the still-somewhat-annoying Roland, is off on an apprenticeship to the sensible Miss Level. (The new boots feel wrong and get swapped out for her concealed old boots at the first opportunity.) Unbeknownst to Tiffany, her precocious experiments with leaving her body as a convenient substitute for a mirror have attracted something very bad, something none of the witches are expecting. The Nac Mac Feegle know a hiver as soon as they feel it, but they have a new kelda now, and she's not sure she wants them racing off after their old kelda. Terry Pratchett is very good at a lot of things, but I don't think villains are one of his strengths. He manages an occasional memorable one (the Auditors, for example, at least before the whole chocolate thing), but I find most of them a bit boring. The hiver is one of the boring ones. It serves mostly as a concretized metaphor about the temptations of magical power, but those temptations felt so unlike the tendencies of Tiffany's personality that I didn't think the metaphor worked in the story. The interesting heart of this book to me is the conflict between Tiffany's impatience with nonsense and Miss Level's arguably excessive willingness to help everyone regardless of how demanding they get. There's something deeper in here about female socialization and how that interacts with Pratchett's conception of witches that got me thinking, although I don't think Pratchett landed the point with full force. Miss Level is clearly a good witch to her village and seems comfortable with how she lives her life, so perhaps they're not taking advantage of her, but she thoroughly slots herself into the helper role. If Tiffany attempted the same role, people would be taking advantage of her, because the role doesn't fit her. And yet, there's a lesson here she needs to learn about seeing other people as people, even if it wouldn't be healthy for her to move all the way to Miss Level's mindset. Tiffany is a precocious kid who is used to being underestimated, and who has reacted by becoming independent and somewhat judgmental. She's also had a taste of real magical power, which creates a risk of her getting too far into her own head. Miss Level is a fount of empathy and understanding for the normal people around her, which Tiffany resists and needed to learn. I think Granny Weatherwax is too much like Tiffany to teach her that. She also has no patience for fools, but she's older and wiser and knows Tiffany needs a push in that direction. Miss Level isn't a destination, but more of a counterbalance. That emotional journey, a conclusion that again focuses on the role of witches in questions of life and death, and Tiffany's fascinatingly spiky mutual respect with Granny Weatherwax were the best parts of this book for me. The middle section with the hiver was rather tedious and forgettable, and the Nac Mac Feegle were entertaining but not more than that. 
It felt like the story went in a few different directions and only some of them worked, in part because the villain intended to tie those pieces together was more of a force of nature than a piece of Tiffany's emotional puzzle. If the hiver had resonated with the darker parts of Tiffany's natural personality, the plot would have worked better. Pratchett was gesturing in that direction, but he never convinced me it was consistent with what we'd already seen of her. Like a lot of the Discworld novels, the good moments in A Hat Full of Sky are astonishing, but the plot is somewhat forgettable. It's still solidly entertaining, though, and if you enjoyed The Wee Free Men, I think this is slightly better. Followed by Going Postal in publication order. The next Tiffany Aching novel is Wintersmith. Rating: 8 out of 10

12 October 2023

Jonathan McDowell: Installing Debian on the BananaPi M2 Zero

My previously mentioned C.H.I.P. repurposing has been partly successful; I've found a use for it (which I still need to write up), but unfortunately it's too useful and the fact it's still a bit flaky has become a problem. I spent a while trying to isolate exactly what the problem is (I'm still seeing occasional hard hangs with no obvious debug output in the logs or on the serial console), then realised I should just buy one of the cheap ARM SBC boards currently available. The C.H.I.P. is based on an Allwinner R8, which is a single ARM v7 core (an A8). So it's fairly low power by today's standards and it seemed pretty much any board would probably do. I considered a Pi 2 Zero, but couldn't be bothered trying to find one in stock at a reasonable price (I've had one on backorder from CPC since May 2022, and yes, I know other places have had them in stock since, but I don't need one enough to chase and I'm now mostly curious about whether it will ever ship). As the title of this post gives away, I settled on a Banana Pi BPI-M2 Zero, which is based on an Allwinner H3. That's a quad-core ARM v7 (an A7), so a bit more oomph than the C.H.I.P. All in all it set me back £25, including a set of heatsinks that form a case around it. I started with the vendor provided Debian SD card image, which is based on Debian 9 (stretch) and so somewhat old. I was able to dist-upgrade my way through buster and bullseye, and end up on bookworm. I then discovered the bookworm 6.1 kernel worked just fine out of the box, and even included a suitable DTB. Which got me thinking about whether I could do a completely fresh Debian install with minimal tweaking. First thing, a boot loader. The Allwinner chips are nice in that they'll boot off SD, so I just needed a suitable u-boot image. Rather than go with the vendor image I had a look at mainline and discovered it had support! So let's build a clean image:
noodles@buildhost:~$ mkdir ~/BPI
noodles@buildhost:~$ cd ~/BPI
noodles@buildhost:~/BPI$ ls
noodles@buildhost:~/BPI$ git clone https://source.denx.de/u-boot/u-boot.git
Cloning into 'u-boot'...
remote: Enumerating objects: 935825, done.
remote: Counting objects: 100% (5777/5777), done.
remote: Compressing objects: 100% (1967/1967), done.
remote: Total 935825 (delta 3799), reused 5716 (delta 3769), pack-reused 930048
Receiving objects: 100% (935825/935825), 186.15 MiB | 2.21 MiB/s, done.
Resolving deltas: 100% (785671/785671), done.
noodles@buildhost:~/BPI$ mkdir u-boot-build
noodles@buildhost:~/BPI$ cd u-boot
noodles@buildhost:~/BPI/u-boot$ git checkout v2023.07.02
...
HEAD is now at 83cdab8b2c Prepare v2023.07.02
noodles@buildhost:~/BPI/u-boot$ make O=../u-boot-build bananapi_m2_zero_defconfig
  HOSTCC  scripts/basic/fixdep
  GEN     Makefile
  HOSTCC  scripts/kconfig/conf.o
  YACC    scripts/kconfig/zconf.tab.c
  LEX     scripts/kconfig/zconf.lex.c
  HOSTCC  scripts/kconfig/zconf.tab.o
  HOSTLD  scripts/kconfig/conf
#
# configuration written to .config
#
make[1]: Leaving directory '/home/noodles/BPI/u-boot-build'
noodles@buildhost:~/BPI/u-boot$ cd ../u-boot-build/
noodles@buildhost:~/BPI/u-boot-build$ make CROSS_COMPILE=arm-linux-gnueabihf-
  GEN     Makefile
scripts/kconfig/conf  --syncconfig Kconfig
...
  LD      spl/u-boot-spl
  OBJCOPY spl/u-boot-spl-nodtb.bin
  COPY    spl/u-boot-spl.bin
  SYM     spl/u-boot-spl.sym
  MKIMAGE spl/sunxi-spl.bin
  MKIMAGE u-boot.img
  COPY    u-boot.dtb
  MKIMAGE u-boot-dtb.img
  BINMAN  .binman_stamp
  OFCHK   .config
noodles@buildhost:~/BPI/u-boot-build$ ls -l u-boot-sunxi-with-spl.bin
-rw-r--r-- 1 noodles noodles 494900 Aug  8 08:06 u-boot-sunxi-with-spl.bin
I had the advantage here of already having a host setup to cross build armhf binaries, but this was all done on a Debian bookworm host with packages from main. I've put my build up here in case it's useful to someone - everything else below can be done on a normal x86_64 host. Next I needed a Debian installer. I went for the netboot variant - although I was writing it to SD rather than TFTP booting, I wanted as much as possible to come over the network.
noodles@buildhost:~/BPI$ wget https://deb.debian.org/debian/dists/bookworm/main/installer-armhf/20230607%2Bdeb12u1/images/netboot/netboot.tar.gz
...
2023-08-08 10:15:03 (34.5 MB/s) - 'netboot.tar.gz' saved [37851404/37851404]
noodles@buildhost:~/BPI$ tar -axf netboot.tar.gz
Then I took a suitable microSD card and set it up with a 500M primary VFAT partition, leaving the rest for Linux proper. I could have got away with a smaller VFAT partition but I'd initially thought I might need to put some more installation files on it.
noodles@buildhost:~/BPI$ sudo fdisk /dev/sdb
Welcome to fdisk (util-linux 2.38.1).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.
Command (m for help): o
Created a new DOS (MBR) disklabel with disk identifier 0x793729b3.
Command (m for help): n
Partition type
   p   primary (0 primary, 0 extended, 4 free)
   e   extended (container for logical partitions)
Select (default p):
Using default response p.
Partition number (1-4, default 1):
First sector (2048-60440575, default 2048):
Last sector, +/-sectors or +/-size{K,M,G,T,P} (2048-60440575, default 60440575): +500M
Created a new partition 1 of type 'Linux' and of size 500 MiB.
Command (m for help): t
Selected partition 1
Hex code or alias (type L to list all): c
Changed type of partition 'Linux' to 'W95 FAT32 (LBA)'.
Command (m for help): n
Partition type
   p   primary (1 primary, 0 extended, 3 free)
   e   extended (container for logical partitions)
Select (default p):
Using default response p.
Partition number (2-4, default 2):
First sector (1026048-60440575, default 1026048):
Last sector, +/-sectors or +/-size{K,M,G,T,P} (534528-60440575, default 60440575):
Created a new partition 2 of type 'Linux' and of size 28.3 GiB.
Command (m for help): w
The partition table has been altered.
Calling ioctl() to re-read partition table.
Syncing disks.
$ sudo mkfs -t vfat -n BPI-UBOOT /dev/sdb1
mkfs.fat 4.2 (2021-01-31)
The bootloader image gets written 8k into the SD card (our first partition starts at sector 2048, i.e. 1M into the device, so there's plenty of space here):
noodles@buildhost:~/BPI$ sudo dd if=u-boot-build/u-boot-sunxi-with-spl.bin of=/dev/sdb bs=1024 seek=8
483+1 records in
483+1 records out
494900 bytes (495 kB, 483 KiB) copied, 0.0282234 s, 17.5 MB/s
Copy the Debian installer files onto the VFAT partition:
noodles@buildhost:~/BPI$ cp -r debian-installer/ /media/noodles/BPI-UBOOT/
Unmount the SD from the build host, pop it into the M2 Zero, boot it up while connected to the serial console, hit a key to stop autoboot and tell it to boot the installer:
U-Boot SPL 2023.07.02 (Aug 08 2023 - 09:05:44 +0100)
DRAM: 512 MiB
Trying to boot from MMC1
U-Boot 2023.07.02 (Aug 08 2023 - 09:05:44 +0100) Allwinner Technology
CPU:   Allwinner H3 (SUN8I 1680)
Model: Banana Pi BPI-M2-Zero
DRAM:  512 MiB
Core:  60 devices, 17 uclasses, devicetree: separate
WDT:   Not starting watchdog@1c20ca0
MMC:   mmc@1c0f000: 0, mmc@1c10000: 1
Loading Environment from FAT... Unable to read "uboot.env" from mmc0:1...
In:    serial
Out:   serial
Err:   serial
Net:   No ethernet found.
Hit any key to stop autoboot:  0
=> setenv dibase /debian-installer/armhf
=> fatload mmc 0:1 ${kernel_addr_r} ${dibase}/vmlinuz
5333504 bytes read in 225 ms (22.6 MiB/s)
=> setenv bootargs "console=ttyS0,115200n8"
=> fatload mmc 0:1 ${fdt_addr_r} ${dibase}/dtbs/sun8i-h2-plus-bananapi-m2-zero.dtb
25254 bytes read in 7 ms (3.4 MiB/s)
=> fdt addr ${fdt_addr_r} 0x40000
Working FDT set to 43000000
=> fatload mmc 0:1 ${ramdisk_addr_r} ${dibase}/initrd.gz
31693887 bytes read in 1312 ms (23 MiB/s)
=> bootz ${kernel_addr_r} ${ramdisk_addr_r}:${filesize} ${fdt_addr_r}
Kernel image @ 0x42000000 [ 0x000000 - 0x516200 ]
## Flattened Device Tree blob at 43000000
   Booting using the fdt blob at 0x43000000
Working FDT set to 43000000
   Loading Ramdisk to 481c6000, end 49fffc3f ... OK
   Loading Device Tree to 48183000, end 481c5fff ... OK
Working FDT set to 48183000
Starting kernel ...
At this point the installer runs and you can do a normal install. Well, except the wifi wasn't detected, I think because the netinst images don't include firmware. I spent a bit of time trying to figure out how to include it but ultimately ended up installing over a USB ethernet dongle, which Just Worked and was less faff. Installing firmware-brcm80211 once installation completed allowed the built-in wifi to work fine. After install you need to configure u-boot to boot without intervention. At the u-boot prompt (i.e. after hitting a key to stop autoboot):
=> setenv bootargs "console=ttyS0,115200n8 root=LABEL=BPI-ROOT ro"
=> setenv bootcmd 'ext4load mmc 0:2 ${fdt_addr_r} /boot/sun8i-h2-plus-bananapi-m2-zero.dtb ; fdt addr ${fdt_addr_r} 0x40000 ; ext4load mmc 0:2 ${kernel_addr_r} /boot/vmlinuz ; ext4load mmc 0:2 ${ramdisk_addr_r} /boot/initrd.img ; bootz ${kernel_addr_r} ${ramdisk_addr_r}:${filesize} ${fdt_addr_r}'
=> saveenv
Saving Environment to FAT... OK
=> reset
This is assuming you have /boot on partition 2 on the SD - I left the first partition as VFAT (that's where the u-boot environment will be saved) and just used all of the rest as a single ext4 partition. I did have to do an e2label /dev/sdb2 BPI-ROOT to label / appropriately (see the short transcript below); otherwise I occasionally saw the SD card appear as mmc1 for Linux (I'm guessing due to asynchronous boot order with the wifi). You should now find the device boots without intervention.
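For reference, the labelling step and a quick check look roughly like this; a sketch, assuming the card is still /dev/sdb on the build host, with the blkid output abbreviated:
$ sudo e2label /dev/sdb2 BPI-ROOT
$ sudo blkid /dev/sdb2
/dev/sdb2: LABEL="BPI-ROOT" UUID="..." TYPE="ext4"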

10 October 2023

Russ Allbery: Review: Chilling Effect

Review: Chilling Effect, by Valerie Valdes
Series: Chilling Effect #1
Publisher: Harper Voyager
Copyright: September 2019
Printing: 2020
ISBN: 0-06-287724-0
Format: Kindle
Pages: 420
Chilling Effect is a space opera, kind of; more on the genre classification in a moment. It is the first volume of a series, although it reaches a reasonable conclusion on its own. It was Valerie Valdes's first novel. Captain Eva Innocente's line of work used to be less than lawful, following in the footsteps of her father. She got out of that life and got her own crew and ship. Now, the La Sirena Negra and its crew do small transport jobs for just enough money to stay afloat. Or, maybe, a bit less than that, when the recipient of a crate full of psychic escape-artist cats goes bankrupt before she can deliver it and get paid. It's a marginal and tenuous life, but at least she isn't doing anything shady. Then the Fridge kidnaps her sister. The Fridge is a shadowy organization of extortionists whose modus operandi is to kidnap a family member of their target, stuff them in cryogenic suspension, and demand obedience lest the family member be sold off as indentured labor after a few decades as a popsicle. Eva will be given missions that she and her crew have to perform. If she performs them well, she will pay off the price of her sister's release. Eventually. Oh, and she's not allowed to tell anyone. I found it hard to place the subgenre of this novel more specifically than comedy-adventure. The technology fits space opera: there are psychic cats, pilots who treat ships as extensions of their own body, brain parasites, a random intergalactic warlord, and very few attempts to explain anything with scientific principles. However, the stakes aren't on the scale that space opera usually goes for. Eva and her crew aren't going to topple governments or form rebellions. They're just trying to survive in a galaxy full of abusive corporations, dodgy clients, and the occasional alien who requires you to carry extensive documentation to prove that you can't be hunted for meat. It is also, as you might guess from that description, occasionally funny. That part of the book didn't mesh for me. Eva is truly afraid for her sister, and some of the events in the book are quite sinister, but the antagonist is an organization called The Fridge that puts people in fridges. Sexual harassment in a bar turns into obsessive stalking by a crazed intergalactic warlord who frequently interrupts the plot by randomly blasting things with his fleet, which felt like something from Hitchhiker's Guide to the Galaxy. The stakes for Eva, and her frustrations at being dragged back into a life she escaped, felt too high for the wacky, comic descriptions of the problems she gets into. My biggest complaint, though, is that the plot is driven by people not telling other people critical information they should know. Eva is keeping major secrets from her crew for nearly the entire book. Other people are also keeping information from Eva. There is a romance subplot driven almost entirely by both parties refusing to talk to each other about the existence of a romance subplot. For some people, this is catnip, but it's one of my least favorite fictional tropes and I found much of the book both frustrating and stressful. Fictional characters keeping important secrets from each other apparently raises my blood pressure. One of the things I did like about this book is that Eva is Hispanic and speaks like it. She resorts to Spanish frequently for curses, untranslatable phrases, aphorisms, derogatory comments, and similar types of emotional communication that don't feel right in a second language. 
Most of the time one can figure out the meaning from context, but Valdes doesn't feel obligated to hold the reader's hand and explain everything. I liked that. I think this approach is more viable in these days of ebook readers that can attempt translations on demand, and I think it does a lot to make Eva feel like a real person. I think the characters are the best part of this book, once one gets past the frustration of their refusal to talk to each other. Eva and the alien ship engineer get the most screen time, but Pink, Eva's honest-to-a-fault friend, was probably my favorite character. I also really enjoyed Min, the ship pilot whose primary goal is to be able to jack into the ship and treat it as her body, and otherwise doesn't particularly care about the rest of the plot as long as she gets paid. A lot of books about ship crews like this one lean hard into found family. This one felt more like a group of coworkers, with varying degrees of friendship and level of interest in their shared endeavors, but without the too-common shorthand of making the less-engaged crew members either some type of villain or someone who needs to be drawn out and turned into a best friend or love interest. It's okay for a job to just be a job, even if it's one where you're around the same people all the time. People who work on actual ships do it all the time. This is a half-serious, half-comic action romp that turned out to not be my thing, but I can see why others will enjoy it. Be prepared for a whole lot of communication failures and an uneven emotional tone, but if you're looking for a space-ships-and-aliens story that doesn't take itself very seriously and has some vague YA vibes, this may work for you. Followed by Prime Deceptions, although I didn't like this well enough to read on. Rating: 6 out of 10

27 September 2023

Antoine Beaupré: How big is Debian?

Now this was quite a tease! For those who haven't seen it, I encourage you to check it out, it has a nice photo of a Debian t-shirt I did not know about, to quote the Fine Article:
Today, when going through a box of old T-shirts, I found the shirt I was looking for to bring to the occasion: [...] For the benefit of people who read this using a non-image-displaying browser or RSS client, they are respectively:
   10 years
  100 countries
 1000 maintainers
10000 packages
and
        1 project
       10 architectures
      100 countries
     1000 maintainers
    10000 packages
   100000 bugs fixed
  1000000 installations
 10000000 users
100000000 lines of code
20 years ago we celebrated eating grilled meat at J0rd1's house. This year, we had vegan tostadas on the menu. And maybe we are no longer that young, but we are still very proud of and happy with our project! Now, how would numbers line up today for Debian, 20 years later? Have we managed to get the bugs fixed line increase by a factor of 10? Quite probably, the lines of code we also have, and I can only guess the number of users and installations, which was already just a wild guess back then, might have multiplied by over 10, at least if we count indirect users and installs as well...
Now I don't know about you, but I really expected someone to come up with an answer to this, directly on Debian Planet! I have patiently waited for such an answer but enough is enough, I'm a Debian member, surely I can cull all of this together. So, lo and behold, here are the actual numbers from 2023! It doesn't line up as nicely, but it looks something like this:
         1 project
        10 architectures
        30 years
       100 countries (actually 63, but we'd like to have yours!)
      1000 maintainers (yep, still there!)
     35000 packages
    211000 *binary* packages
   1000000 bugs fixed
1000000000 lines of code
 uncounted installations and users, we don't track you
So maybe the more accurate version, rounding to the nearest logarithm, would look something like:
         1 project
        10 architectures
       100 countries (actually 63, but we'd like to have yours!)
      1000 maintainers (yep, still there!)
    100000 packages
   1000000 bugs fixed
1000000000 lines of code
 uncounted installations and users, we don't track you
I really like how the "packages" and "bugs fixed" lines still have an order of magnitude between them there, but the "bugs fixed" vs "lines of code" have an extra order of magnitude, that is, we have fixed ten times fewer bugs per line of code since we last did this count, 20 years ago. Also, I am tempted to put 100 years in there, but that would be rounding up too much. Let's give it another 30 years first. Hopefully, some real scientist is going to balk at this crude methodology and come up with some more interesting numbers for the next t-shirt. Otherwise I'm available for bar mitzvahs and children's parties.
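For anyone who wants to poke at the package counts themselves, here is a rough sketch of one way to do it for bookworm; it only covers the main component of a single suite and a single architecture, so it will not match the rounded t-shirt figures exactly:
$ # source packages in bookworm main
$ curl -s https://deb.debian.org/debian/dists/bookworm/main/source/Sources.xz | xz -d | grep -c '^Package:'
$ # binary packages in bookworm main, amd64
$ curl -s https://deb.debian.org/debian/dists/bookworm/main/binary-amd64/Packages.xz | xz -d | grep -c '^Package:'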

22 September 2023

Gunnar Wolf: Debian@30 Found the shirt I was looking for last month

Almost a month ago, I went to my always loved Rancho Electrónico to celebrate the 30th anniversary of the Debian project. Hats off to Jathan for all the work he put into this! I was there for close to 3hr, and be it following up an install, doing a talk, or whatever, he was doing it. But anyway, I only managed to attend with one of my (great, beautiful and always loved) generic Debian or DebConf T-shirts. Today, when going through a box of old T-shirts, I found the shirt I was looking for to bring to the occasion. A smallish print, ~12cm wide, over the heart: And as a larger print, ~25cm wide, across the back: For the benefit of people who read this using a non-image-displaying browser or RSS client, they are respectively:
   10 years
  100 countries
 1000 maintainers
10000 packages
and
        1 project
       10 architectures
      100 countries
     1000 maintainers
    10000 packages
   100000 bugs fixed
  1000000 installations
 10000000 users
100000000 lines of code
20 years ago we celebrated eating grilled meat at J0rd1's house. This year, we had vegan tostadas on the menu. And maybe we are no longer that young, but we are still very proud of and happy with our project! Now, how would numbers line up today for Debian, 20 years later? Have we managed to get the bugs fixed line increase by a factor of 10? Quite probably, the lines of code we also have, and I can only guess the number of users and installations, which was already just a wild guess back then, might have multiplied by over 10, at least if we count indirect users and installs as well...

15 September 2023

John Goerzen: How Gapped is Your Air?

Sometimes we want better-than-firewall security for things. For instance:
  1. An industrial control system for a municipal water-treatment plant should never have data come in or out
  2. Or, a variant of the industrial control system: it should only permit telemetry and monitoring data out, and nothing else in or out
  3. A system dedicated to keeping your GPG private keys secure should only have material to sign (or decrypt) come in, and signatures (or decrypted data) go out
  4. A system keeping your tax records should normally only have new records go in, but may on occasion have data go out (eg, to print a copy of an old record)
In this article, I'll talk about the "high side" (the high-security or high-sensitivity systems) and the "low side" (the lower-sensitivity or general-purpose systems). For the sake of simplicity, I'll assume the high side is a single machine, but it could as well be a whole network. Let's focus on examples 3 and 4 to make things simpler. Let's consider the primary concern to be data exfiltration (someone stealing your data), with a secondary concern of data integrity (somebody modifying or destroying your data). You might think the safest possible approach is "airgapped": that is, there is literally no physical network connection to the machine at all. This helps! But then, the problem becomes: how do we deal with the inevitable need to legitimately get things on or off of the system? As I wrote in Dead USB Drives Are Fine: Building a Reliable Sneakernet, by using tools such as NNCP, you can certainly create a "sneakernet": using USB drives as transport. While this is a very secure setup, as with most things in security, it's less than perfect. The Wikipedia airgap article discusses some ways airgapped machines can still be exploited. It mentions that security holes relating to removable media have been exploited in the past. There are also other ways to get data out; for instance, Debian ships with gensio and minimodem, both of which can transfer data acoustically. But let's back up and think about why we think of airgapped machines as so much more secure, and what the failure modes of other approaches might be.

What about firewalls? You could very easily set up a high-side machine that is on a network, but is restricted to only one outbound TCP port. There could be a local firewall, and perhaps also a special port on an external firewall that implements the same restrictions. A variant on this approach would be two computers connected directly by a crossover cable, though this doesn't necessarily imply being more secure. Of course, the concern about a local firewall is that it could potentially be compromised. An external firewall might too; for instance, if your credentials to it were on a machine that got compromised. This kind of dual compromise may be unlikely, but it is possible. We can also think about the complexity in a network stack and firewall configuration, and think that there may be various opportunities to have things misconfigured or buggy in a system of that complexity. Another consideration is that data could be sent at any time, potentially making it harder to detect. On the other hand, network monitoring tools are commonplace. Then again, it is convenient and cheap. I use a system along those lines to do my backups. Data is sent, gpg-encrypted and then encrypted again at the NNCP layer, to the backup server. The NNCP process on the backup server runs as an untrusted user, and dumps the gpg-encrypted files to a secure location that is then processed by a cron job using Filespooler. The backup server is on a dedicated firewall port, with a dedicated subnet. The only ports allowed out are for NNCP and NTP, and offsite backups. There is no default gateway. Not even DNS is permitted out (the firewall does the appropriate redirection). There is one pinhole allowed out, where a subset of the backup data is sent offsite. I initially used USB drives as transport, and it had no network connection at all. But there were disadvantages to doing this for backups, particularly that I'd have no backups for as long as I forgot to move the drives. The backup system also would have clock drift, and the offsite backup picture was more challenging. (The clock drift was a problem because I use 2FA on the system: a password, plus a TOTP generated by a Yubikey.) This is pretty good security, I'd think. What are the weak spots? Well, if there were somehow a bug in the NNCP client, and the remote NNCP were compromised, that could lead to a compromise of the NNCP account. But this itself would accomplish little; some other vulnerability would have to be exploited on the backup server, because the NNCP account can't see plaintext data at all. I use borgbackup to send a subset of backup data offsite over ssh. borgbackup has to run as root to be able to access all the files, but the ssh it calls runs as a separate user. An ssh vulnerability is therefore unlikely to cause much damage. If, somehow, the remote offsite system were compromised and it was able to exploit a security issue in the local borgbackup, that would be a problem. But that sounds like a remote possibility. borgbackup itself can't even be used over a sneakernet since it is not asynchronous. A more secure solution would probably be using something like dar over NNCP. This would eliminate the ssh installation entirely, and allow a complete isolation between the data-access and the communication stacks, and notably not require bidirectional communication. Logical separation matters too. My Roundup of Data Backup and Archiving Tools may be helpful here.
Other attack vectors could be a vulnerability in the kernel's networking stack, local root exploits that could be combined with exploiting NNCP or borgbackup to gain root, or local misconfiguration that makes the sandboxes around NNCP and borgbackup less secure. Because this system is in my basement in a utility closet with no chairs and no good place for a console, I normally manage it via a serial console. While it's a dedicated line between the system and another machine, if the other machine is compromised or an adversary gets access to the physical line, credentials (and perhaps even data) could leak, albeit slowly. But we can do much better with serial lines. Let's take a look.

Serial lines Some of us remember RS-232 serial lines and their once-ubiquitous DB-9 connectors. Traditionally, their speed maxed out at 115.2Kbps. Serial lines have the benefit that they can be a direct application-to-application link. In my backup example above, a serial line could directly link the NNCP daemon on one system with the NNCP caller on another, with no firewall or anything else necessary. It is simply up to those programs to open the serial device appropriately. This isn't perfect, however. Unlike TCP over Ethernet, a serial line has no inherent error checking. Modern programs such as NNCP and ssh assume that a lower layer is making the link completely clean and error-free for them, and will interpret any corruption as an attempt to tamper and sever the connection. However, there is a solution to that: gensio. In my page Using gensio and ser2net, I discuss how to run NNCP and ssh over gensio. gensio is a generic framework that can add framing, error checking, and retransmit to an unreliable link such as a serial port. It can also add encryption and authentication using TLS, which could be particularly useful for applications that aren't already doing that themselves. More traditional solutions for serial communications have their own built-in error correction. For instance, UUCP and Kermit both were designed in an era of noisy serial lines and might be an excellent fit for some use cases. The ZModem protocol also might be, though it offers somewhat less flexibility and automation than Kermit. I have found that certain USB-to-serial adapters by Gearmo will actually run at up to 2Mbps on a serial line! Look for the ones on their spec pages with an FTDI chipset rated at 920Kbps. It turns out they can successfully be driven faster, especially if gensio's relpkt is used. I've personally verified 2Mbps operation (Linux port speed 2000000) on Gearmo's USA-FTDI2X and the USA-FTDI4X. (I haven't seen any single-port options from Gearmo with the 920Kbps chipset, but they may exist.) Still, even at 2Mbps, speed may well be a limiting factor with some applications. If what you need is a console and some textual or batch data, it's probably fine. If you are sending 500GB backup files, you might look for something else. In theory, this USB to RS-422 adapter should work at 10Mbps, but I haven't tried it. But if the speed works, running a dedicated application over a serial link could be a nice and fairly secure option. One of the benefits of the airgapped approach is that data never leaves unless you are physically aware of transporting a USB stick. Of course, you may not be physically aware of what is ON that stick in the event of a compromise. This could easily be solved with a serial approach by, say, only plugging in the cable when you have data to transfer.

Data diodes A traditional diode lets electrical current flow in only one direction. A data diode is the same concept, but for data: a hardware device that allows data to flow in only one direction. This could be useful, for instance, in the tax records system that should only receive data, or the industrial system that should only send it. Wikipedia claims that the simplest kind of data diode is a fiber link with transceivers connected in only one direction. I think you could go one simpler: a serial cable with only ground and TX connected at one end, wired to ground and RX at the other. (I haven't tried this.) This approach does have some challenges:
  • Many existing protocols assume a bidirectional link and won't be usable
  • There is a challenge of confirming data was successfully received. For a situation like telemetry, maybe it doesn't matter; another observation will come along in a minute. But for sending important documents, one wants to make sure they were properly received.
In some cases, the solution might be simple. For instance, with telemetry, just writing out data down the serial port in a simple format may be enough. For sending files, various mitigations, such as sending them multiple times, etc., might help. You might also look into FEC-supporting infrastructure such as blkar and flute, but these don't provide an absolute guarantee. There is no perfect solution to knowing when a file has been successfully received if the data communication is entirely one-way.

Audio transport I hinted above that minimodem and gensio both are software audio modems. That is, you could literally use speakers and microphones, or alternatively audio cables, as a means of getting data into or out of these systems. This is pretty limited; it is 1200bps, and often half-duplex, and could literally be disrupted by barking dogs in some setups. But hey, it's an option.
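To make that concrete, here is a sketch of the basic minimodem transmit and receive invocations; the file names are placeholders and in practice you would tune the baudmode and audio devices to your setup:
$ # on the sending side: encode a file as audio out the sound card
$ minimodem --tx 1200 < report.txt
$ # on the receiving side: listen on the default audio input and decode
$ minimodem --rx 1200 > report.txt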

Airgapped with USB transport This is the scenario I began with, and named some of the possible pitfalls above as well. In addition to those, note also that USB drives aren't necessarily known for their error-free longevity. Be prepared for failure.

Concluding thoughts I wanted to lay out a few things in this post. First, that simply being airgapped is generally a step forward in security, but is not perfect. Secondly, that both physical and logical separation matter. And finally, that while tools like NNCP can make airgapped-with-USB-drive-transport a doable reality, there are also alternatives worth considering, especially serial ports, firewalled hard-wired Ethernet, data diodes, and so forth. I think serial links, in particular, have been largely forgotten these days. Note: This article also appears on my website, where it may be periodically updated.

12 September 2023

John Goerzen: A Maze of Twisty Little Pixels, All Tiny

Two years ago, I wrote Managing an External Display on Linux Shouldn't Be This Hard. Happily, since I wrote that post, most of those issues have been resolved. But then you throw HiDPI into the mix and it all goes wonky. If you're running X11, basically the story is that you can change the scale factor, but it only takes effect on newly-launched applications (which means a logout/in because some of your applications you can't really re-launch). That is a problem if, like me, you sometimes connect an external display that is HiDPI, sometimes not, or your internal display is HiDPI but others aren't. Wayland is far better, supporting on-the-fly resizes quite nicely. I've had two devices with HiDPI displays: a Surface Go 2, and a work-issued Thinkpad. The Surface Go 2 is my ultraportable Linux tablet. I use it sparingly at home, and rarely with an external display. I just put Gnome on it, in part because Gnome had better on-screen keyboard support at the time, and left it at that. On the work-issued Thinkpad, I really wanted to run KDE thanks to its tiling support (I wound up using bismuth with it). KDE was buggy with Wayland at the time, so I just stuck with X11 and ran my HiDPI displays at lower resolutions and lived with the fuzziness. But now that I have a Framework laptop with a HiDPI screen, I wanted to get this right. I tried both Gnome and KDE. Here are my observations with both: Gnome I used PaperWM with Gnome. PaperWM is a tiling manager with a unique horizontal ribbon approach. It grew on me; I think I would be equally at home, or maybe even prefer it, to my usual xmonad-style approach. Editing the active window border color required editing ~/.local/share/gnome-shell/extensions/paperwm@hedning:matrix.org/stylesheet.css and inserting background-color and border-color items in the paperwm-selection section. Gnome continues to have an absolutely terrible picture for configuring things. It has no less than four places to make changes (Settings, Tweaks, Extensions, and dconf-editor). In many cases, configuration for a given thing is split between Settings and Tweaks, and sometimes even with Extensions, and then there are sometimes options that are only visible in dconf. That is, where the Gnome people have even allowed something to be configurable. Gnome installs a power manager by default. It offers three options: performance, balanced, and saver. There is no explanation of the difference between them. None. What is it setting when I change the pref? A maximum frequency? A scaling governor? A balance between performance and efficiency cores? Not only that, but there's no way to tell it to just use performance when plugged in and balanced or saver when on battery. In an issue about adding that, a Gnome dev wrote "We're not going to add a preference just because you want one". KDE, on the other hand, aside from not mucking with your system's power settings in this way, has a nice panel with "on AC" and "on battery" and you can very easily tweak various settings accordingly. The hostile attitude from the Gnome developers in that thread was a real turnoff. While Gnome has excellent support for Wayland, it doesn't (directly) support fractional scaling. That is, you can set it to 100%, 200%, and so forth, but no 150%. Well, unless you manage to discover that you can run gsettings set org.gnome.mutter experimental-features "['scale-monitor-framebuffer']" first. (Oh wait, does that make a FIFTH settings tool? Why yes it does.)
Despite its name, that allows you to select fractional scaling under Wayland. For X11 apps, they will be blurry, a problem that is optional under KDE (more on that below). Gnome won't show the battery life time remaining on the task bar. Yikes. An extension might work in some cases. Not only that, but the Gnome battery icon frequently failed to indicate AC charging when AC was connected, a problem that didn't exist on KDE. Both Gnome and KDE support night light (warmer color temperatures at night), but Gnome's often didn't change when it should have, or changed on one display but not the other. The appindicator extension is pretty much required, as otherwise a number of applications (eg, Nextcloud) don't have their icon display anywhere. It does, however, generate a significant amount of log spam. There may be a fix for this. Unlike KDE, which has a nice unobtrusive popup asking what to do, Gnome silently automounts USB sticks when inserted. This is often wrong; for instance, if I'm about to dd a Debian installer to it, I definitely don't want it mounted. I learned this the hard way. It is particularly annoying because in a GUI, there is no reason to mount a drive before the user tries to access it anyhow. It looks like there is a dconf setting, but then to actually mount a drive you have to open up Files (because OF COURSE Gnome doesn't have a nice removable-drives icon like KDE does) and it's a bunch of annoying clicks, and I didn't want to use the GUI file manager anyway. Same for unmounting; two clicks in KDE thanks to the task bar icon, but in Gnome you have to open up the file manager, unmount the drive, close the file manager again, etc. The ssh agent on Gnome doesn't start up for a Wayland session, though this is easily enough worked around. The reason I completely soured on Gnome is that after using it for a while, I noticed my laptop fans spinning up. One core would be constantly busy. It was busy with a kworker events task, something to do with sound events. Logging out would resolve it. I believe it to be a Gnome shell issue. I could find no resolution to this, and am unwilling to tolerate the decreased battery life this implies. The Gnome summary: it looks nice out of the box, but you quickly realize that this is something of a paper-thin illusion when you try to actually use it regularly. KDE The KDE experience on Wayland was a little bit the opposite of Gnome. While with Gnome, things start out looking great but you realize there are some serious issues (especially battery-eating), with KDE things start out looking a tad rough but you realize you can trivially fix them and wind up with a very solid system. Compared to Gnome, KDE never had a battery-draining problem. It will show me estimated battery time remaining if I want it to. It will do whatever I want it to when I insert a USB drive. It doesn't muck with my CPU power settings, and lets me easily define "on AC" vs "on battery" settings for things like suspend when idle. KDE supports fractional scaling, to any arbitrary setting (even with the gsettings thing above, Gnome still only supports it in 25% increments). Then the question is what to do with X11-only applications. KDE offers two choices. The first is "Scaled by the system", which is also the only option for Gnome. With that setting, the X11 apps effectively run natively at 100% and then are scaled up within Wayland, giving them a blurry appearance on HiDPI displays.
The advantage is that the scaling happens within Wayland, so the size of the app will always be correct even when the Wayland scaling factor changes. The other option is "Apply scaling themselves", which uses native X11 scaling. This lets most X11 apps display crisp and sharp, but then if the system scaling changes, due to limitations of X11, you'll have to restart the X apps to get them to be the correct size. I appreciate the choice, and use "Apply scaling themselves" because only a few of my apps aren't Wayland-aware. I did encounter a few bugs in KDE under Wayland: sddm, the display manager, would be slow to stop and cause a long delay on shutdown or reboot. This seems to be a known issue with sddm and Wayland, and is easily worked around by adding a systemd TimeoutStopSec (a sketch of the drop-in is at the end of this post). Konsole, the KDE terminal emulator, has weird display artifacts when using fractional scaling under Wayland. I applied some patches and rebuilt Konsole and then all was fine. The Bismuth tiling extension has some pretty weird behavior under Wayland, but a 1-character patch fixes it. On Debian, KDE mysteriously installed Pulseaudio instead of Debian's new default Pipewire, but that was easily fixed as well (and Pulseaudio also works fine). Conclusions I'm sticking with KDE. Given that I couldn't figure out how to stop Gnome from deciding to eat enough battery to make my fan come on, the decision wasn't hard. But even if it weren't for that, I'd have gone with KDE. Once a couple of things were patched, the experience is solid, fast, and flawless. Emacs (my main X11-only application) looks great with the self-scaling in KDE. Gimp, which I use occasionally, was terrible with the blurry scaling in Gnome. Update: Corrected the gsettings command
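For reference, the sddm TimeoutStopSec workaround mentioned above boils down to a small systemd drop-in, roughly like this; the 10 second value is an arbitrary choice:
$ sudo mkdir -p /etc/systemd/system/sddm.service.d
$ printf '[Service]\nTimeoutStopSec=10s\n' | sudo tee /etc/systemd/system/sddm.service.d/override.conf
$ sudo systemctl daemon-reload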

5 September 2023

Russ Allbery: Review: Before We Go Live

Review: Before We Go Live, by Stephen Flavall
Publisher: Spender Books
Copyright: 2023
ISBN: 1-7392859-1-3
Format: Kindle
Pages: 271
Stephen Flavall, better known as jorbs, is a Twitch streamer specializing in strategy games and most well-known as one of the best Slay the Spire players in the world. Before We Go Live, subtitled Navigating the Abusive World of Online Entertainment, is a memoir of some of his experiences as a streamer. It is his first book. I watch a lot of Twitch. For a long time, it was my primary form of background entertainment. (Twitch's baffling choices to cripple their app have subsequently made YouTube somewhat more attractive.) There are a few things one learns after a few years of watching a lot of streamers. One is that it's a precarious, unforgiving living for all but the most popular streamers. Another is that the level of behind-the-scenes drama is very high. And a third is that the prevailing streaming style has converged on fast-talking, manic, stream-of-consciousness joking apparently designed to satisfy people with very short attention spans. As someone for whom that manic style is like nails on a chalkboard, I am therefore very picky about who I'm willing to watch and rarely can tolerate the top streamers for more than an hour. jorbs is one of the handful of streamers I've found who seems pitched towards adults who don't need instant bursts of dopamine. He's calm, analytical, and projects a relaxed, comfortable feeling most of the time (although like the other streamers I prefer, he doesn't put up with nonsense from his chat). If you watch him for a while, he's also one of those people who makes you think "oh, this is an interestingly unusual person." It's a bit hard to put a finger on, but he thinks about things from intriguing angles. Going in, I thought this would be a general non-fiction book about the behind-the-scenes experience of the streaming industry. Before We Go Live isn't really that. It is primarily a memoir focused on Flavall's personal experience (as well as the experience of his business manager Hannah) with the streaming team and company F2K, supplemented by a brief history of Flavall's streaming career and occasional deeply personal thoughts on his own mental state and past experiences. Along the way, the reader learns a lot more about his thought processes and approach to life. He is indeed a fascinatingly unusual person. This is to some extent an expos , but that's not the most interesting part of this book. It quickly becomes clear that F2K is the sort of parasitic, chaotic, half-assed organization that crops up around any new business model. (Yes, there's crypto.) People who are good at talking other people out of money and making a lot of big promises try to follow a startup fast-growth model with unclear plans for future revenue and hope that it all works out and turns into a valuable company. Most of the time it doesn't, because most of the people running these sorts of opportunistic companies are better at talking people out of money than at running a business. When the new business model is in gaming, you might expect a high risk of sexism and frat culture; in this case, you would not be disappointed. This is moderately interesting but not very revealing if one is already familiar with startup culture and the kind of people who start businesses without doing any of the work the business is about. The F2K principals are at best opportunistic grifters, if not actual con artists. It's not long into this story before this is obvious. 
At that point, the main narrative of this book becomes frustrating; Flavall recognizes the dysfunction to some extent, but continues to associate with these people. There are good reasons related to his (and Hannah's) psychological state, but it doesn't make it easier to read. Expect to spend most of the book yelling "just break up with these people already" as if you were reading Captain Awkward letters. The real merit of this book is that people are endlessly fascinating, Flavall is charmingly quirky, and he has the rare mix of the introspection that allows him to describe himself without the tendency to make his self-story align with social expectations. I think every person is intriguingly weird in at least some ways, but usually the oddities are smoothed away and hidden under a desire to present as "normal" to the rest of society. Flavall has the right mix of writing skill and a willingness to write with direct honesty that lets the reader appreciate and explore the complex oddities of a real person, including the bits that at first don't make much sense. Parts of this book are uncomfortable reading. Both Flavall and his manager Hannah are abuse survivors, which has a lot to do with their reactions to their treatment by F2K, and those reactions are both tragic and maddening to read about. It's a good way to build empathy for why people will put up with people who don't have their best interests at heart, but at times that empathy can require work because some of the people on the F2K side are so transparently sleazy. This is not the sort of book I'm likely to re-read, but I'm glad I read it simply for that time spent inside the mind of someone who thinks very differently than I do and is both honest and introspective enough to give me a picture of his thought processes that I think was largely accurate. This is something memoir is uniquely capable of doing if the author doesn't polish all of the oddities out of their story. It takes a lot of work to be this forthright about one's internal thought processes, and Flavall does an excellent job. Rating: 7 out of 10

25 August 2023

Ian Jackson: I cycled to all the villages in alphabetical order

This last weekend I completed a bike rides project I started during the first Covid lockdown in 2020: I ve cycled to every settlement (and radio observatory) within 20km of my house, in alphabetical order. Stir crazy In early 2020, during the first lockdown, I was going a bit stir crazy. Clare said you re going very strange, you have to go out and get some exercise . After a bit of discussion, we came up with this plan: I d visit all the local villages, in alphabetical order. Choosing the radius I decided that I would pick a round number of kilometers, as the crow flies, from my house. 20km seemed about right. 25km would have included Ely, which would have been nice, but it would have added a great many places, all of them quite distant. Software I wrote a short Rust program to process OSM data into a list of places to visit, and their distances and bearings. You can download a tarball of the alphabetical villages scanner. (I haven t published the git history because it has my house s GPS coordinates in it, and because I committed the output files from which that location can be derived.) The Rides I set off on my first ride, to Aldreth, on Sunday the 31st of May 2020. The final ride collected Yelling, on Saturday the 19th of August 2023. I did quite a few rides in June and July 2020 - more than one a week. (I d read the lockdown rules, and although some of the government messaging said you should stay near your house, that wasn t in the legislation. Of course I didn t go into any buildings or anything.) I m not much of a morning person, so I often set off after lunch. For the longer rides I would usually pack a picnic. Almost all of the rides I did just by myself. There were a handful where I had friends along: Dry Drayton, which I collected with Clare, at night. I held my bike up so the light shone at the village sign, so we could take a photo of it. Madingley, Melbourn and Meldreth, which was quite an expedition with my friend Ben. We went out as far as Royston and nearby Barley (both outside my radius and not on my list) mostly just so that my project would have visited Hertfordshire. The Hemingfords, where I had my friend Matthew along, and we had a very nice pub lunch. Girton and Wilburton, where I visited friends. Indeed, I stopped off in Wilburton on one or two other occasions. And, of course, Yelling, for which there were four of us, again with a nice lunch (in Eltisley). I had relatively little mechanical trouble. My worst ride for this was Exning: I got three punctures that day. Luckily the last one was close to home. I often would stop to take lots of photos en-route. My mum in particular appreciated all the pretty pictures. Rules I decided on these rules: I would cycle to each destination, in order, and it would count as collected if I rode both there and back. I allowed collecting multiple villages in the same outing, provided I did them in the right order. (And obviously I was allowed to pass through places out of order, without counting them.) I tried to get a picture of the village sign, where there was one. Failing that, I got a picture of something in the village with the village s name on it. I think the only one I didn t manage this for was Westley Bottom; I had to make do with the word Westley on some railway level crossing equipment. In Barway I had to make do with a planning application, stuck to a pole. I tried not to enter and leave a village by the same road, if possible. 
Edge cases I had to make some decisions: I decided that I would consider the project complete if I visited everywhere whose centre was within my radius. But the centre of a settlement is rather hard to define. I needed a hard criterion for my OpenStreetMap data mining: a place counted if there was any node, way or relation, with the relevant place tag, any part of which was within my ambit. That included some places that probably oughtn t to have counted, but, fine. I also decided that I wouldn t visit suburbs of Cambridge, separately from Cambridge itself. I don t consider them separate settlements, at least, not if they re conurbated with Cambridge. So that excluded Trumpington, for example. But I decided that Girton and Fen Ditton were (just) separable. Although the place where I consider Girton and Cambridge to nearly touch, is administratively well inside Girton, I chose to look at land use (on the ground, and in OSM data), rather than administrative boundaries. But I did visit both Histon and Impington, and all each of the Shelfords and Stapleford, as separate entries in my list. Mostly because otherwise I d have to decide whether to skip (say) Impington, or Histon. Whereas skipping suburbs of Cambridge in favour of Cambridge itself was an easy decision, and it also got rid of a bunch of what would have been quite short, boring, urban expeditions. I sorted all the Greats and Littles under G and L, rather than (say) Shelford, Great , which seemed like it would be cheating because then I would be able to do Shelford, Great and Shelford, Little in one go. Northstowe turned from mostly a building site into something that was arguably a settlement, during my project. It wasn t included in the output of my original data mining. Of course it s conurbated with Oakington - but happily, Northstowe inserts right before Oakington in the alphabetical list, so I decided to add it, visiting both the old and new in the same day. There are a bunch of other minor edge cases. Some villages have an outlying hamlet. Mostly I included these. There are some individual farms, which I generally didn t count. Some stats I visited 150 villages plus the Lords Bridge radio observatory. The project took 3 years and 3 months to complete. There were 96 rides, totalling about 4900km. So my mean distance was around 51km. The median distance per ride was a little higher, at around 52 km, and the median duration (including stoppages) was about 2h40. The total duration, if you add them all up, including stoppages, was about 275h, giving a mean speed including photo stops, lunches and all, of 18kph. The longest ride was 89.8km, collecting Scotland Farm, Shepreth, and Six Mile Bottom, so riding across the Cam valley. The shortest ride was 7.9km, collecting Cambridge (obviously); and I think that s the only one I did on my Brompton. The rest were all on my trusty Thorn Audax. My fastest ride (ranking by distance divided by time spent in motion) was to collect Haddenham, where I covered 46.3km in 1h39, giving an average speed in motion of 28.0kph. The most I collected in one day was 5 places: West Wickham, West Wratting, Westley Bottom, Westley Waterless, and Weston Colville. That was the day of the Wests. (There s only one East: East Hatley.) Map Here is a pretty picture of all of my tracklogs:
Edited 2023-08-25 01:32 BST to correct a slip.



4 August 2023

Reproducible Builds: Reproducible Builds in July 2023

Welcome to the July 2023 report from the Reproducible Builds project. In our reports, we try to outline the most important things that we have been up to over the past month. As ever, if you are interested in contributing to the project, please visit the Contribute page on our website.
Marcel Fourné et al. presented at the IEEE Symposium on Security and Privacy in San Francisco, CA on The Importance and Challenges of Reproducible Builds for Software Supply Chain Security. As summarised in last month's report, the abstract of their paper begins:
The 2020 Solarwinds attack was a tipping point that caused a heightened awareness about the security of the software supply chain and in particular the large amount of trust placed in build systems. Reproducible Builds (R-Bs) provide a strong foundation to build defenses for arbitrary attacks against build systems by ensuring that given the same source code, build environment, and build instructions, bitwise-identical artifacts are created. (PDF)
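To make the property described in that abstract concrete, here is a rough sketch of the basic check (./build.sh, dist/app.tar and the temporary paths are hypothetical placeholders, not anything from the paper): rebuild the same source twice and compare checksums, reaching for diffoscope when they differ.
# Hypothetical placeholders: ./build.sh stands in for a project's pinned build
# instructions and dist/app.tar for its output artifact.
./build.sh && cp dist/app.tar /tmp/first.tar
git clean -dxf                              # reset the working tree, same source revision
./build.sh && cp dist/app.tar /tmp/second.tar
sha256sum /tmp/first.tar /tmp/second.tar    # identical checksums = reproducible
diffoscope /tmp/first.tar /tmp/second.tar   # otherwise, show exactly where they differ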

Chris Lamb published an interview with Simon Butler, associate senior lecturer in the School of Informatics at the University of Skövde, on the business adoption of Reproducible Builds. (This is actually the seventh instalment in a series featuring the projects, companies and individuals who support our project. We started this series by featuring the Civil Infrastructure Platform project, and followed this up with a post about the Ford Foundation as well as recent ones about ARDC, the Google Open Source Security Team (GOSST), Bootstrappable Builds, the F-Droid project and David A. Wheeler.) Vagrant Cascadian presented Breaking the Chains of Trusting Trust at FOSSY 2023.
Rahul Bajaj has been working with Roland Clobus on merging an overview of environment variations to our website:
I have identified 16 root causes for unreproducible builds in my empirical study, which I have linked to the corresponding documentation. The initial MR right now contains information about 10 root causes. For each root cause, I have provided a definition, a notable instance, and a workaround. However, I have only found workarounds for 5 out of the 10 root causes listed in this merge request. In the upcoming commits, I plan to add an additional 6 root causes. I kindly request you review the text for any necessary refinements, modifications, or corrections. Additionally, I would appreciate the help with documentation for the solutions/workarounds for the remaining root causes: Archive Metadata, Build ID, File System Ordering, File Permissions, and Snippet Encoding. Your input on the identified root causes for unreproducible builds would be greatly appreciated. [ ]
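For a concrete flavour of the kind of workaround being collected there, one commonly documented fix for the archive metadata and file system ordering causes (a generic sketch, not text from the merge request) is to build archives with normalised ownership, sorted member names and a clamped timestamp, using the SOURCE_DATE_EPOCH convention:
# Generic sketch: produce the same tarball regardless of build host and clock.
# SOURCE_DATE_EPOCH is the convention documented by reproducible-builds.org;
# build/ is a placeholder for the directory being archived.
export SOURCE_DATE_EPOCH="$(git log -1 --pretty=%ct)"   # or any fixed timestamp
tar --sort=name \
    --owner=0 --group=0 --numeric-owner \
    --mtime="@${SOURCE_DATE_EPOCH}" \
    -cf output.tar build/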

Just a reminder that our upcoming Reproducible Builds Summit is set to take place from October 31st to November 2nd 2023 in Hamburg, Germany. Our summits are a unique gathering that brings together attendees from diverse projects, united by a shared vision of advancing the Reproducible Builds effort. During this enriching event, participants will have the opportunity to engage in discussions, establish connections and exchange ideas to drive progress in this vital field. If you're interested in joining us this year, please make sure to read the event page which has more details about the event and location.
There was more progress towards making the Go programming language ecosystem reproducible this month, including: In addition, kpcyrd posted to our mailing list to report that:
while packaging govulncheck for Arch Linux I noticed a checksum mismatch for a tar file I downloaded from go.googlesource.com. I used diffoscope to compare the .tar file I downloaded with the .tar file the build server downloaded, and noticed the timestamps are different.
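A rough sketch of that kind of comparison, with made-up file names, looks like this; the tar listing confirms whether timestamps are the culprit, and diffoscope shows everything else:
# Made-up file names: compare the locally downloaded tarball against the one
# fetched by the build server.
sha256sum local.tar buildserver.tar       # confirm the checksum mismatch
tar --full-time -tvf local.tar | head     # GNU tar: list entries with full timestamps
tar --full-time -tvf buildserver.tar | head
diffoscope local.tar buildserver.tar      # human-readable diff of contents and metadata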

In Debian, 20 reviews of Debian packages were added, 25 were updated and 25 were removed this month, adding to our knowledge about identified issues. A number of issue types were updated, including marking ffile_prefix_map_passed_to_clang as being fixed since Debian bullseye [ ] and adding a Debian bug tracker reference for the nondeterminism_added_by_pyqt5_pyrcc5 issue [ ]. In addition, Roland Clobus posted another detailed update of the status of reproducible Debian ISO images on our mailing list. In particular, Roland helpfully summarised that "live images are looking good, and the number of (passing) automated tests is growing".
Bernhard M. Wiedemann published another monthly report about reproducibility within openSUSE.
F-Droid added 20 new reproducible apps in July, making 165 apps in total that are published with Reproducible Builds and using the upstream developer's signature. [ ]
The Sphinx documentation tool recently accepted a change to improve deterministic reproducibility of documentation. Its internal util.inspect.object_description attempts to sort collections, but this can fail. The change handles the failure case by using string-based object descriptions as a fallback deterministic sort ordering, as well as adding recursive object-description calls for list and tuple datatypes. As a result, documentation generated by Sphinx will be more likely to be automatically reproducible. Lastly in news, kpcyrd posted to our mailing list announcing a new repro-env tool:
My initial interest in reproducible builds was "how do I distribute pre-compiled binaries on GitHub without people raising security concerns about them". I've cycled back to this original problem about 5 years later and built a tool that is meant to address this. [ ]

Upstream patches The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:
In diffoscope development this month, versions 244, 245 and 246 were uploaded to Debian unstable by Chris Lamb, who also made the following changes:
  • Don t include the file size in image metadata. It is, at best, distracting, and it is already in the directory metadata. [ ]
  • Add compatibility with libarchive-5. [ ]
  • Mark that the test_dex::test_javap_14_differences test requires the procyon tool. [ ]
  • Initial work on DOS/MBR extraction. [ ]
  • Move to using assert_diff in the .ico and .jpeg tests. [ ]
  • Temporarily mark some Android-related tests as XFAIL due to Debian bugs #1040941 & #1040916. [ ]
  • Fix the test skipped reason generation in the case of a version outside of the required range. [ ]
  • Update copyright years. [ ][ ]
  • Fix try.diffoscope.org. [ ]
In addition, Gianfranco Costamagna added support for LLVM version 16. [ ]

Testing framework The Reproducible Builds project operates a comprehensive testing framework (available at tests.reproducible-builds.org) in order to check packages and other artifacts for reproducibility. In July, a number of changes were made by Holger Levsen:
  • General changes:
    • Upgrade Jenkins host to Debian bookworm now that Debian 12.1 is out. [ ][ ][ ][ ]
    • djm: improve UX when rebooting a node fails. [ ]
    • djm: reduce wait time between rebooting nodes. [ ]
  • Debian-related changes:
    • Various refactoring of the Debian scheduler. [ ][ ][ ]
    • Make Debian live builds more robust with respect to salsa.debian.org returning HTTP 502 errors. [ ][ ]
    • Use the legacy SCP protocol instead of the SFTP protocol when transferring Debian live builds. [ ][ ]
    • Speed up a number of database queries (thanks, Myon!). [ ][ ][ ][ ][ ]
    • Split create_meta_pkg_sets job into two (for Debian unstable and Debian testing) to halve the job runtime to approximately 90 minutes. [ ][ ]
    • Split scheduler job into four separate jobs, one for each tested architecture. [ ][ ]
    • Treat more PostgreSQL errors as serious (for some jobs). [ ]
    • Re-enable automatic database documentation now that postgresql_autodoc is back in Debian bookworm. [ ]
    • Remove various hardcoding of Debian release names. [ ]
    • Drop some i386 special casing. [ ]
  • Other distributions:
    • Speed up Alpine SQL queries. [ ]
    • Adjust CSS layout for Arch Linux pages to match 3 and not 4 repos being tested. [ ]
    • Drop the community Arch Linux repo as it has now been merged into the extra repo. [ ]
    • Speed up a number of Arch-related database queries. [ ]
    • Try harder to properly cleanup after building OpenWrt packages. [ ]
    • Drop all kfreebsd-related tests now that it's officially dead. [ ]
  • System health:
    • Always ignore some well-known harmless orphan processes. [ ][ ][ ]
    • Detect another case of job failure due to Jenkins shutdown. [ ]
    • Show all non co-installable package sets on the status page. [ ]
    • Warn that some specific reboot nodes are currently false-positives. [ ]
  • Node health checks:
    • Run system and node health checks for Jenkins less frequently. [ ]
    • Try to restart any failed dpkg-db-backup [ ] and munin-node services [ ].
In addition, Vagrant Cascadian updated the paths in our automated tests to use the same paths used by the official Debian build servers. [ ]

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

26 July 2023

Valhalla's Things: Elastic Neck Top

Posted on July 26, 2023
A woman wearing a top in white fabric with thin blue lines and two groups of blue lozenges near the hems. It has a square neck gathered by a yellow elastic, the blue lines are horizontal on the wide sleeves and vertical, and more spaced, on the body. For some time I've been thinking about making myself a top or a dress with a wide gathered neckline that can be worn at different widths, including off-the-shoulders. A few years ago I was gifted a cut of nice, thin white fabric with a print of lines and lozenges that isn't uniform along the fabric, but looks like it was designed for some specific garment, and it was waiting in my stash for a suitable pattern. And a few days ago, during a Sunday lunch, there was an off-hand mention of a dress from the late 1970s which had an elastic in the neckline, so that it could be optionally worn off-the-shoulders. And something snapped into place. I had plans for that afternoon, but they were scrapped, and I started to draw, measure, cut rectangles of fabric, pin and measure again, cut more fabric. The main pieces of the top laid flat: a big rectangle for the body,
two rectangular tubes for the sleeves laid so that they meet the body just at the corners, and a triangle (a square gusset folded on the diagonal) joins them to the body.
I decided on a pattern made of rectangles to be able to use as much fabric as possible, with the size of each rectangle based mostly on the various sections on the print of the fabric. I ve made the typical sleeves from a rectangle and a square gusset, and then attached them to the body just from the gusset to keep the neckline wide and low. The worn top shown from the side back: there is a strip of vertical lines spaced closer together like on the sleeves, and it continues to the bottom rather than ending with a strip of lozenges. The part of the fabric with large vertical stripes had two different widths: I could have made the back narrower, but I decided to just keep a strip with narrower lines to one side. The fabric also didn t have a full second strip of lozenges, so I had to hem it halfway through it. Closeup of the center front and center back of the neckline casing, showing the matched lines. The casing for the elastic was pieced from various scraps, but at least I was able to match the lines on the center front and back, even if they are different. Not that it matters a lot, since it s all hidden in the gathering, but I would have known. And since I was working on something definitely modern, even if made out of squares and rectangles, of course I decided to hand-sew everything, mostly to be able to use quite small sewing allowances, since the fabric was pretty thin. In my stash I had a piece of swimsuit elastic that feels nice, looks nice and makes a knot that doesn t slip, so I used it. It s a perfect match, except for the neon yellow colour, which I do like, but maybe is a bit too high visibility? I will see if the haberdasher has the same elastic in dark blue, but right now this will do. It was a quick project anyway: by the end of the working week the top was finished; I think that on a sewing machine it would be easy to make it in a day. the top worn with the neckline pulled down to leave the shoulders bare. And it can be worn off the shoulders! Which is something I will probably never do in public (and definitely not outdoors), but now if I wanted I could! :D As usual, the pattern (for what pattern there is) and instructions are on my pattern website under a #FreeSoftWear license, and I ve also added to the site a tip on how I use electrician fish tape to thread things through long casings

11 July 2023

Simon Josefsson: Coping with non-free software in Debian

A personal reflection on how I moved from my Debian home to find two new homes with Trisquel and Guix for my own ethical computing, and while doing so settled my dilemma about further Debian contributions. Debian's contributions to the free software community have been tremendous. Debian was one of the early distributions in the 1990s that combined the GNU tools (compiler, linker, shell, editor, and a set of Unix tools) with the Linux kernel and published a free software operating system. Back then there was little guidance on how to publish free software binaries, let alone entire operating systems. There was a lack of established community processes and conflict resolution mechanisms, and a lack of guiding principles to motivate the work. The community building efforts that came about in parallel with the technical work have resulted in a steady flow of releases over the years. From the work of Richard Stallman and the Free Software Foundation (FSF) during the 1980s and early 1990s, there was at the time already an established definition of free software. Inspired by the free software definition, and a belief that a social contract helps to build a community and resolve conflicts, Debian's social contract (DSC) with the free software community was published in 1997. The DSC included the Debian Free Software Guidelines (DFSG), which directly led to the Open Source Definition.

Slackware 3.5" disks: one of my earlier Slackware install disk sets, kept for nostalgic reasons.
I was introduced to GNU/Linux through Slackware in the early 1990 s (oh boy those nights calculating XFree86 modeline s and debugging sendmail.cf) and primarily used RedHat Linux during ca 1995-2003. I switched to Debian during the Woody release cycles, when the original RedHat Linux was abandoned and Fedora launched. It was Debian s explicit community processes and infrastructure that attracted me. The slow nature of community processes also kept me using RedHat for so long: centralized and dogmatic decision processes often produce quick and effective outcomes, and in my opinion RedHat Linux was technically better than Debian ca 1995-2003. However the RedHat model was not sustainable, and resulted in the RedHat vs Fedora split. Debian catched up, and reached technical stability once its community processes had been grounded. I started participating in the Debian community around late 2006. My interpretation of Debian s social contract is that Debian should be a distribution of works licensed 100% under a free license. The Debian community has always been inclusive towards non-free software, creating the contrib/non-free section and permitting use of the bug tracker to help resolve issues with non-free works. This is all explained in the social contract. There has always been a clear boundary between free and non-free work, and there has been a commitment that the Debian system itself would be 100% free. The concern that RedHat Linux was not 100% free software was not critical to me at the time: I primarily (and happily) ran GNU tools on Solaris, IRIX, AIX, OS/2, Windows etc. Running GNU tools on RedHat Linux was an improvement, and I hadn t realized it was possible to get rid of all non-free software on my own primary machine. Debian realized that goal for me. I ve been a believer in that model ever since. I can use Solaris, macOS, Android etc knowing that I have the option of using a 100% free Debian. While the inclusive approach towards non-free software invite and deserve criticism (some argue that being inclusive to non-inclusive behavior is a bad idea), I believe that Debian s approach was a successful survival technique: by being inclusive to and a compromise between free and non-free communities, Debian has been able to stay relevant and contribute to both environments. If Debian had not served and contributed to the free community, I believe free software people would have stopped contributing. If Debian had rejected non-free works completely, I don t think the successful Ubuntu distribution would have been based on Debian. I wrote the majority of the text above back in September 2022, intending to post it as a way to argue for my proposal to maintain the status quo within Debian. I didn t post it because I felt I was saying the obvious, and that the obvious do not need to be repeated, and the rest of the post was just me going down memory lane. The Debian project has been a sustainable producer of a 100% free OS up until Debian 11 bullseye. In the resolution on non-free firmware the community decided to leave the model that had resulted in a 100% free Debian for so long. The goal of Debian is no longer to publish a 100% free operating system, instead this was added: The Debian official media may include firmware . Indeed the Debian 12 bookworm release has confirmed that this would not only be an optional possibility. 
The Debian community could have published a 100% free Debian, in parallel with the non-free Debian, and still be consistent with their newly adopted policy, but chose not to. The result is that Debian s policies are not consistent with their actions. It doesn t make sense to claim that Debian is 100% free when the Debian installer contains non-free software. Actions speaks louder than words, so I m left reading the policies as well-intended prose that is no longer used for guidance, but for the peace of mind for people living in ivory towers. And to attract funding, I suppose. So how to deal with this, on a personal level? I did not have an answer to that back in October 2022 after the vote. It wasn t clear to me that I would ever want to contribute to Debian under the new social contract that promoted non-free software. I went on vacation from any Debian work. Meanwhile Debian 12 bookworm was released, confirming my fears. I kept coming back to this text, and my only take-away was that it would be unethical for me to use Debian on my machines. Letting actions speak for themselves, I switched to PureOS on my main laptop during October, barely noticing any difference since it is based on Debian 11 bullseye. Back in December, I bought a new laptop and tried Trisquel and Guix on it, as they promise a migration path towards ppc64el that PureOS do not. While I pondered how to approach my modest Debian contributions, I set out to learn Trisquel and gained trust in it. I migrated one Debian machine after another to Trisquel, and started to use Guix on others. Migration was easy because Trisquel is based on Ubuntu which is based on Debian. Using Guix has its challenges, but I enjoy its coherant documented environment. All of my essential self-hosted servers (VM hosts, DNS, e-mail, WWW, Nextcloud, CI/CD builders, backup etc) uses Trisquel or Guix now. I ve migrated many GitLab CI/CD rules to use Trisquel instead of Debian, to have a more ethical computing base for software development and deployment. I wish there were official Guix docker images around. Time has passed, and when I now think about any Debian contributions, I m a little less muddled by my disappointment of the exclusion of a 100% free Debian. I realize that today I can use Debian in the same way that I use macOS, Android, RHEL or Ubuntu. And what prevents me from contributing to free software on those platforms? So I will make the occasional Debian contribution again, knowing that it will also indirectly improve Trisquel. To avoid having to install Debian, I need a development environment in Trisquel that allows me to build Debian packages. I have found a recipe for doing this: # System commands:
sudo apt-get install debhelper git-buildpackage debian-archive-keyring
sudo wget -O /usr/share/debootstrap/scripts/debian-common https://sources.debian.org/data/main/d/debootstrap/1.0.128%2Bnmu2/scripts/debian-common
sudo wget -O /usr/share/debootstrap/scripts/sid https://sources.debian.org/data/main/d/debootstrap/1.0.128%2Bnmu2/scripts/sid
# Run once to create build image:
DIST=sid git-pbuilder create --mirror http://deb.debian.org/debian/ --debootstrapopts "--exclude=usr-is-merged" --basepath /var/cache/pbuilder/base-sid.cow
# Run in a directory with debian/ to build a package:
gbp buildpackage --git-pbuilder --git-dist=sid
How to sustainably deliver a 100% free software binary distributions seems like an open question, and the challenges are not all that different compared to the 1990 s or early 2000 s. I m hoping Debian will come back to provide a 100% free platform, but my fear is that Debian will compromise even further on the free software ideals rather than the opposite. With similar arguments that were used to add the non-free firmware, Debian could compromise the free software spirit of the Linux boot process (e.g., non-free boot images signed by Debian) and media handling (e.g., web browsers and DRM), as Debian have already done with appstore-like functionality for non-free software (Python pip). To learn about other freedom issues in Debian packaging, browsing Trisquel s helper scripts may enlight you. Debian s setback and the recent setback for RHEL-derived distributions are sad, and it will be a challenge for these communities to find internally consistent coherency going forward. I wish them the best of luck, as Debian and RHEL are important for the wider free software eco-system. Let s see how the community around Trisquel, Guix and the other FSDG-distributions evolve in the future. The situation for free software today appears better than it was years ago regardless of Debian and RHEL s setbacks though, which is important to remember! I don t recall being able install a 100% free OS on a modern laptop and modern server as easily as I am able to do today. Happy Hacking! Addendum 22 July 2023: The original title of this post was Coping with non-free Debian, and there was a thread about it that included feedback on the title. I do agree that my initial title was confrontational, and I ve changed it to the more specific Coping with non-free software in Debian. I do appreciate all the fine free software that goes into Debian, and hope that this will continue and improve, although I have doubts given the opinions expressed by the majority of developers. For the philosophically inclined, it is interesting to think about what it means to say that a compilation of software is freely licensed. At what point does a compilation of software deserve the labels free vs non-free? Windows probably contains some software that is published as free software, let s say Windows is 1% free. Apple authors a lot of free software (as a tangent, Apple probably produce more free software than what Debian as an organization produces), and let s say macOS contains 20% free software. Solaris (or some still maintained derivative like OpenIndiana) is mostly freely licensed these days, isn t it? Let s say it is 80% free. Ubuntu and RHEL pushes that closer to let s say 95% free software. Debian used to be 100% but is now slightly less at maybe 99%. Trisquel and Guix are at 100%. At what point is it reasonable to call a compilation free? Does Debian deserve to be called freely licensed? Does macOS? Is it even possible to use these labels for compilations in any meaningful way? All numbers just taken from thin air. It isn t even clear how this can be measured (binary bytes? lines of code? CPU cycles? etc). The caveat about license review mistakes applies. I ignore Debian s own claims that Debian is 100% free software, which I believe is inconsistent and no longer true under any reasonable objective analysis. It was not true before the firmware vote since Debian ships with non-free blobs in the Linux kernel for example.

27 June 2023

Russell Coker: Running a ChatGPT Type System

Alpaca: I've just done some experiments running ChatGPT type programs. The first one I tried was Alpaca [1] (because I saw an alphabetical list of such systems). Here's the 4GB data file [2]. The data file is 4G in size and is memory mapped, so just over 4G of RAM is needed to run it. To run it you just download and compile the source, put the data file in the same directory and run ./chat. I run it with the Bubblewrap container via the following command line:
# Read-only bind the current directory (data file + chat binary) and /usr,
# recreate the usual /lib and /lib64 symlinks, provide fresh /proc and /dev,
# and run ./chat in its own PID namespace with no network access.
exec bwrap \
  --ro-bind "$(pwd)" "$(pwd)" \
  --ro-bind /usr /usr \
  --symlink usr/lib64 /lib64 \
  --symlink usr/lib /lib \
  --proc /proc \
  --dev /dev \
  --unshare-pid \
  --unshare-net \
  --die-with-parent \
  ./chat
Here are some examples of its work. As you can see some answers are wrong (Helium is lighter than air), some are guesses (there is no evidence of life outside our solar system), and the questions weren't always well answered (the issue of where we might find life wasn't addressed). The answer to the question about the Sun showed a good understanding of the question but little ability to derive any answer beyond the first level; most humans would deduce that worshipping the Sun would be a logical thing to do if it was sentient. Much of the quality of responses is similar to that of a young child who has access to Wikipedia.
> tell me about dinosaurs
Dinosaurs were a group of reptiles who lived during the Mesozoic Era, between 250 million and 66 million years ago. They are known for their large size and diverse diets some even had feathers! Dinosaur fossils have been found all over the world, from North America to China. Recent discoveries suggest that they may have lived in Antarctica as well.
> are hydrogen or helium balloons better?
Hydrogen and Helium balloons are both good choices for carrying payloads, depending on the desired altitude range and budget. Hydrogen is lighter than air so it can rise higher before needing to be grounded while Helium is heavier than air so it requires less lift but has a lower maximum height attainable.
> is a 486 a good cpu?
Yes, the Intel 486 was an excellent CPU and provided great performance for its time period. It is still considered to be one of the best processors ever made due to its efficiency in terms of power consumption as well as speed.
> where will we find life outside the Earth?
Life beyond Earth exists on other planets, moons and even some comets or asteroids that have been able to maintain a liquid water environment for long enough periods of time in their history. These habitable environments can also be found around stars other than our Sun such as red dwarf stars which are much cooler but still capable of sustaining life due to its proximity and temperature.
> how do I write a good blog post?
Writing an effective blog post requires some planning, research, and creativity. Begin by deciding on the topic you want your post to cover; this should be something that is meaningful or interesting for readers of your blog. Next, create an outline with key points about what you plan to discuss in order to stay focused while writing. Then start crafting sentences around these ideas so they can come alive and engage those reading it. Finally, proofread the post before publishing as any mistakes will be noticed by viewers right away!
> if the sun was a sentient being how would that change life on earth?
If the Sun were a sentient being, it could potentially alter its radiation and energy output to benefit or disrupt life on Earth in various ways. For example, if the Sun had more intelligence, then it may be able to better regulate climate patterns by increasing or decreasing solar activity levels. Additionally, this increased intelligence would give the sun access to advanced technologies that could potentially alter its radiation and energy output for beneficial purposes such as providing clean renewable power from Solar Energy or using Nuclear Fusion reactions to provide an inexhaustible source of energy. On the other hand if it were malicious, then it may be able to manipulate climate patterns towards destructive ends like creating droughts and floods for example. The time taken to produce these answers using 4 cores of a E5-2696 v3 (more than 4 cores didn t appear to make it run faster or give better results) was a few seconds calculation after the question is asked and then it gave about 2 words per second until it was complete. Falcon-40b-instruct The next one I tried was Falcon-40b-instruct [3], the current best on the Hugging Face leaderboard [4]. It has a 90G set of data files. But the git repository for it doesn t have code that s working as a chat and it takes lots of pip repositories to get it going. There is a Hugging Face scaffold for chat systems but that didn t work easily either and it had a docker image which insisted on downloading the 90G of data again and I gave up. I guess Falcon is not for people who have little Python experience. Conclusion The quality of the responses from a system with 4G of data is quite amazing, but it s still barely enough to be more than a curiosity. It s a long way from the quality of ChatGPT [5] or the phind.com service described as The AI search engine for developers [6]. I have found phind.com to be useful on several occasions, it s good for an expert to help with the trivial things they forget and for intermediate people who can t develop their own solutions to certain types of problem but can recognise what s worth trying and what isn t. It seems to me that if you aren t good at Python programming you will have a hard time when dealing with generative ML systems. Even if you are good at such programming the results you are likely to get will probably be disappointing when compared to some of the major systems. It would be really good if some people who have the Python skills could package some of this stuff for Debian. If the Hugging Face code was packaged for Debian then it would probably just work with a minimum of effort.

25 June 2023

Jonathan McDowell: Figuring out the right card for foreign currency transactions

While travel these days is much reduced I still end up in Dublin regularly enough (though less so now I'm not working directly with folk there), and have the occasional US trip. Given that I live and work in the UK, and thus get paid in GBP (£), this leads to the question of what to do about USD ($) and EUR (€) transactions. USD turns out to be easy; I still have a US account from when I lived there and keeping it active has, so far, not proved to be a problem. EUR is trickier. In an ideal world I'd find a fee-free account that would allow normal Eurozone transfers and provide me with a suitable debit + credit card for use while travelling. I've found options with a fee, but I don't have enough Euro transactions to make it worthwhile. A friend pointed me at the HSBC Global Money debit card, which claims no fees for currency conversions from £ and competitive rates. As I already have an account with HSBC it was easy to sign up to try it out. And, having recently been on a trip to CenterParcs, I had the perfect opportunity to compare it to my (and my wife's) existing cards to see how the rates played out. In particular I've always meant to figure out if the cards that don't charge a loading rate end up with the same exchange rate as the cards that do have a separate per transaction charge. So, armed with the HSBC Global Money card, a Monzo debit card, a Nationwide debit card, and a co-operative debit card, we embarked on a week of eating out in the interests of data! The results will probably shock no one who's properly looked into this themselves. All the cards were on the VISA network, and they all ended up at around £1 = €1.15. According to Xe that was roughly the mid-market rate for the week I was away, so it seems like trusting VISA to do the currency conversion is a reasonable thing to do. The problem comes with the range of charges:
Card Exchange rate Charge %
Co-Op 1.15 2.65%
HSBC 1.15 0.00%
Monzo 1.15 0.00%
Nationwide 1.15 2.99%
This matches what Nationwide claims for its foreign transaction fees, though Co-Op claims 2.75% and the numbers looked more like 2.65% on the small transactions we incurred. So, the HSBC Global Money card is definitely a decent option. However Monzo is just as good, and has the advantage that it's a normal current account rather than something slightly different you have to funnel money into. Both are let down by the fact you need to use an app to check your balance etc - Monzo seems to only have extremely basic web access to any of their accounts and HSBC don't expose the Global Money account at all via web banking, only the phone app. Finally, what I really want is to be able to do this with a credit card, especially for things like online purchases or staying at hotels. My current credit card suffers from a per-transaction charge, but while looking up the Nationwide debit charge percentage to confirm it matched what I saw I discovered that Nationwide credit cards do not charge a fee. So I think I'll be investigating that for my next trip.
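For anyone wanting to reproduce the Charge % column from their own statements, the arithmetic is: convert the EUR amount at the mid-market rate to get the fee-free GBP cost, then compare with what the card actually charged. A rough sketch with made-up numbers:
# Made-up numbers: eur = amount spent, gbp = amount the card charged,
# mid = mid-market rate in EUR per GBP for that day (e.g. from Xe).
eur=23.00; gbp=20.53; mid=1.15
awk -v eur="$eur" -v gbp="$gbp" -v mid="$mid" 'BEGIN {
  fair = eur / mid                                  # fee-free GBP equivalent
  printf "effective fee: %.2f%%\n", (gbp / fair - 1) * 100
}'
# prints: effective fee: 2.65%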

30 May 2023

Russ Allbery: Review: The Mimicking of Known Successes

Review: The Mimicking of Known Successes, by Malka Older
Series: Mossa and Pleiti #1
Publisher: Tordotcom
Copyright: 2023
ISBN: 1-250-86051-2
Format: Kindle
Pages: 169
The Mimicking of Known Successes is a science fiction mystery novella, the first of an expected series. (The second novella is scheduled to be published in February of 2024.) Mossa is an Investigator, called in after a man disappears from the eastward platform on the 4 63' line. It's an isolated platform, five hours away from Mossa's base, and home to only four residential buildings and a pub. The most likely explanation is that the man jumped, but his behavior before he disappeared doesn't seem consistent with that theory. He was bragging about being from Valdegeld University, talking to anyone who would listen about the important work he was doing not typically the behavior of someone who is suicidal. Valdegeld is the obvious next stop in the investigation. Pleiti is a Classics scholar at Valdegeld. She is also Mossa's ex-girlfriend, making her both an obvious and a fraught person to ask for investigative help. Mossa is the last person she expected to be waiting for her on the railcar platform when she returns from a trip to visit her parents. The Mimicking of Known Successes is mostly a mystery, following Mossa's attempts to untangle the story of what happened to the disappeared man, but as you might have guessed there's a substantial sapphic romance subplot. It's also at least adjacent to Sherlock Holmes: Mossa is brilliant, observant, somewhat monomaniacal, and very bad at human relationships. All of this story except for the prologue is told from Pleiti's perspective as she plays a bit of a Watson role, finding Mossa unreadable, attractive, frustrating, and charming in turn. Following more recent Holmes adaptations, Mossa is portrayed as probably neurodivergent, although the story doesn't attach any specific labels. I have no strong opinions about this novella. It was fine? There's a mystery with a few twists, there's a sapphic romance of the second chance variety, there's a bit of action and a bit of hurt/comfort after the action, and it all felt comfortably entertaining but kind of predictable. Susan Stepney has a "passes the time" review rating, and while that may be a bit harsh, that's about where I ended up. The most interesting part of the story is the science fiction setting. We're some indefinite period into the future. Humans have completely messed up Earth to the point of making it uninhabitable. We then took a shot at terraforming Mars and messed that planet up to the point of uninhabitability as well. Now, what's left of humanity (maybe not all of it the story isn't clear) lives on platforms connected by rail lines high in the atmosphere of Jupiter. (Everyone in the story calls Jupiter "Giant" for reasons that I didn't follow, given that they didn't rename any of its moons.) Pleiti's position as a Classics scholar means that she studies Earth and its now-lost ecosystems, whereas the Modern faculty focus on their new platform life. This background does become relevant to the mystery, although exactly how is not clear at the start. I wouldn't call this a very realistic setting. One has to accept that people are living on platforms attached to artificial rings around the solar system's largest planet and walk around in shirt sleeves and only minor technological support due to "atmoshields" of some unspecified capability, and where the native atmosphere plays the role of London fog. Everything feels vaguely Edwardian, including to the occasional human porter and message runner, which matches the story concept but seems unlikely as a plausible future culture. 
I also disbelieve in humanity's ability to do anything to Earth that would make it less inhabitable than the clouds of Jupiter. That said, the setting is a lot of fun, which is probably more important. It's fun to try to visualize, and it has that slightly off-balance, occasionally surprising feel of science fiction settings where everyone is recognizably human but the things they consider routine and unremarkable are unexpected by the reader. This novella also has a great title. The Mimicking of Known Successes is simultaneously a reference a specific plot point from late in the story, a nod to the shape of the romance, and an acknowledgment of the Holmes pastiche, and all of those references work even better once you know what the plot point is. That was nicely done. This was not very memorable apart from the setting, but it was pleasant enough. I can't say that I'm inspired to pre-order the next novella in this series, but I also wouldn't object to reading it. If you're in the mood for gender-swapped Holmes in an exotic setting, you could do worse. Followed by The Imposition of Unnecessary Obstacles. Rating: 6 out of 10

27 April 2023

Arturo Borrero González: Kubecon and CloudNativeCon 2023 Europe summary

Post logo This post serves as a report from my attendance to Kubecon and CloudNativeCon 2023 Europe that took place in Amsterdam in April 2023. It was my second time physically attending this conference, the first one was in Austin, Texas (USA) in 2017. I also attended once in a virtual fashion. The content here is mostly generated for the sake of my own recollection and learnings, and is written from the notes I took during the event. The very first session was the opening keynote, which reunited the whole crowd to bootstrap the event and share the excitement about the days ahead. Some astonishing numbers were announced: there were more than 10.000 people attending, and apparently it could confidently be said that it was the largest open source technology conference taking place in Europe in recent times. It was also communicated that the next couple iteration of the event will be run in China in September 2023 and Paris in March 2024. More numbers, the CNCF was hosting about 159 projects, involving 1300 maintainers and about 200.000 contributors. The cloud-native community is ever-increasing, and there seems to be a strong trend in the industry for cloud-native technology adoption and all-things related to PaaS and IaaS. The event program had different tracks, and in each one there was an interesting mix of low-level and higher level talks for a variety of audience. On many occasions I found that reading the talk title alone was not enough to know in advance if a talk was a 101 kind of thing or for experienced engineers. But unlike in previous editions, I didn t have the feeling that the purpose of the conference was to try selling me anything. Obviously, speakers would make sure to mention, or highlight in a subtle way, the involvement of a given company in a given solution or piece of the ecosystem. But it was non-invasive and fair enough for me. On a different note, I found the breakout rooms to be often small. I think there were only a couple of rooms that could accommodate more than 500 people, which is a fairly small allowance for 10k attendees. I realized with frustration that the more interesting talks were immediately fully booked, with people waiting in line some 45 minutes before the session time. Because of this, I missed a few important sessions that I ll hopefully watch online later. Finally, on a more technical side, I ve learned many things, that instead of grouping by session I ll group by topic, given how some subjects were mentioned in several talks. On gitops and CI/CD pipelines Most of the mentions went to FluxCD and ArgoCD. At that point there were no doubts that gitops was a mature approach and both flux and argoCD could do an excellent job. ArgoCD seemed a bit more over-engineered to be a more general purpose CD pipeline, and flux felt a bit more tailored for simpler gitops setups. I discovered that both have nice web user interfaces that I wasn t previously familiar with. However, in two different talks I got the impression that the initial setup of them was simple, but migrating your current workflow to gitops could result in a bumpy ride. This is, the challenge is not deploying flux/argo itself, but moving everything into a state that both humans and flux/argo can understand. I also saw some curious mentions to the config drifts that can happen in some cases, even if the goal of gitops is precisely for that to never happen. Such mentions were usually accompanied by some hints on how to operate the situation by hand. 
Worth mentioning, I missed any practical information about one of the key pieces to this whole gitops story: building container images. Most of the showcased scenarios were using pre-built container images, so in that sense they were simple. Building and pushing to an image registry is one of the two key points we would need to solve in Toolforge Kubernetes if adopting gitops. In general, even if gitops were already in our radar for Toolforge Kubernetes, I think it climbed a few steps in my priority list after the conference. Another learning was this site: https://opengitops.dev/. Group On etcd, performance and resource management I attended a talk focused on etcd performance tuning that was very encouraging. They were basically talking about the exact same problems we have had in Toolforge Kubernetes, like api-server and etcd failure modes, and how sensitive etcd is to disk latency, IO pressure and network throughput. Even though Toolforge Kubernetes scale is small compared to other Kubernetes deployments out there, I found it very interesting to see other s approaches to the same set of challenges. I learned how most Kubernetes components and apps can overload the api-server. Because even the api-server talks to itself. Simple things like kubectl may have a completely different impact on the API depending on usage, for example when listing the whole list of objects (very expensive) vs a single object. The conclusion was to try avoiding hitting the api-server with LIST calls, and use ResourceVersion which avoids full-dumps from etcd (which, by the way, is the default when using bare kubectl get calls). I already knew some of this, and for example the jobs-framework-emailer was already making use of this ResourceVersion functionality. There have been a lot of improvements in the performance side of Kubernetes in recent times, or more specifically, in how resources are managed and used by the system. I saw a review of resource management from the perspective of the container runtime and kubelet, and plans to support fancy things like topology-aware scheduling decisions and dynamic resource claims (changing the pod resource claims without re-defining/re-starting the pods). On cluster management, bootstrapping and multi-tenancy I attended a couple of talks that mentioned kubeadm, and one in particular was from the maintainers themselves. This was of interest to me because as of today we use it for Toolforge. They shared all the latest developments and improvements, and the plans and roadmap for the future, with a special mention to something they called kubeadm operator , apparently capable of auto-upgrading the cluster, auto-renewing certificates and such. I also saw a comparison between the different cluster bootstrappers, which to me confirmed that kubeadm was the best, from the point of view of being a well established and well-known workflow, plus having a very active contributor base. The kubeadm developers invited the audience to submit feature requests, so I did. The different talks confirmed that the basic unit for multi-tenancy in kubernetes is the namespace. Any serious multi-tenant usage should leverage this. There were some ongoing conversations, in official sessions and in the hallway, about the right tool to implement K8s-whitin-K8s, and vcluster was mentioned enough times for me to be convinced it was the right candidate. This was despite of my impression that multiclusters / multicloud are regarded as hard topics in the general community. 
I definitely would like to play with it sometime down the road. On networking I attended a couple of basic sessions that served really well to understand how Kubernetes instrumented the network to achieve its goal. The conference program had sessions to cover topics ranging from network debugging recommendations, CNI implementations, to IPv6 support. Also, one of the keynote sessions had a reference to how kube-proxy is not able to perform NAT for SIP connections, which is interesting because I believe Netfilter Conntrack could do it if properly configured. One of the conclusions on the CNI front was that Calico has a massive community adoption (in Netfilter mode), which is reassuring, especially considering it is the one we use for Toolforge Kubernetes. Slide On jobs I attended a couple of talks that were related to HPC/grid-like usages of Kubernetes. I was truly impressed by some folks out there who were using Kubernetes Jobs on massive scales, such as to train machine learning models and other fancy AI projects. It is acknowledged in the community that the early implementation of things like Jobs and CronJobs had some limitations that are now gone, or at least greatly improved. Some new functionalities have been added as well. Indexed Jobs, for example, enables each Job to have a number (index) and process a chunk of a larger batch of data based on that index. It would allow for full grid-like features like sequential (or again, indexed) processing, coordination between Job and more graceful Job restarts. My first reaction was: Is that something we would like to enable in Toolforge Jobs Framework? On policy and security A surprisingly good amount of sessions covered interesting topics related to policy and security. It was nice to learn two realities:
  1. kubernetes is capable of doing pretty much anything security-wise and create greatly secured environments.
  2. it does not by default. The defaults are not security-strict on purpose.
It kind of made sense to me: Kubernetes was used for a wide range of use cases, and developers didn t know beforehand to which particular setup they should accommodate the default security levels. One session in particular covered the most basic security features that should be enabled for any Kubernetes system that would get exposed to random end users. In my opinion, the Toolforge Kubernetes setup was already doing a good job in that regard. To my joy, some sessions referred to the Pod Security Admission mechanism, which is one of the key security features we re about to adopt (when migrating away from Pod Security Policy). I also learned a bit more about Secret resources, their current implementation and how to leverage a combo of CSI and RBAC for a more secure usage of external secrets. Finally, one of the major takeaways from the conference was learning about kyverno and kubeaudit. I was previously aware of the OPA Gatekeeper. From the several demos I saw, it was to me that kyverno should help us make Toolforge Kubernetes more sustainable by replacing all of our custom admission controllers with it. I already opened a ticket to track this idea, which I ll be proposing to my team soon. Final notes In general, I believe I learned many things, and perhaps even more importantly I re-learned some stuff I had forgotten because of lack of daily exposure. I m really happy that the cloud native way of thinking was reinforced in me, which I still need because most of my muscle memory to approach systems architecture and engineering is from the old pre-cloud days. List of sessions I attended on the first day: List of sessions I attended on the second day: List of sessions I attended on third day: The videos have been published on Youtube.
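As a small addendum to the api-server and etcd notes above, here is a hedged sketch of the difference between a plain LIST and the cheaper variants mentioned in that talk; the namespace is an arbitrary placeholder:
# A bare LIST asks for a fully consistent snapshot, which is the expensive path:
kubectl get pods -n example-ns

# Paginating the LIST caps how much is returned per request:
kubectl get pods -n example-ns --chunk-size=200

# Querying the raw API with resourceVersion=0 allows the api-server to answer
# from its watch cache instead of performing a quorum read against etcd:
kubectl get --raw '/api/v1/namespaces/example-ns/pods?resourceVersion=0'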
